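## itertools is a powerful library that provides many tools for working with iterators. For a specific problem, however, a direct algorithm can be more efficient.
## In our case, we want to find all common substrings of two strings. Using itertools would likely mean generating every possible substring and then comparing them, which can lead to unnecessary computation.
## The sliding-window method used here is based on the following observation:
## if the two strings share a character at some pair of positions, we can try to extend that match one character at a time until the characters differ or either string ends.
## This finds a common substring directly, without generating and comparing every possible substring combination.
## For this particular problem, the sliding-window method is therefore likely to be more efficient than itertools. That does not mean itertools is not a useful library: for other kinds of problems it may offer a more concise and more efficient solution.
## The code below uses the sliding-window method.
## find_common_substrings_huadong_fix01
## fix01: the returned substrings could contain one another. Before adding a substring to the result set, we check whether the new substring is contained in any substring already in the set, or whether any substring in the set is contained in the new one.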

import openpyxl

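def find_common_substrings(s1, s2, min_len=4):
    """Return the common substrings of s1 and s2 that are at least min_len long.

    No returned substring is contained in another returned substring.
    """
    len1, len2 = len(s1), len(s2)
    results = set()

    for i in range(len1):
        for j in range(len2):
            if s1[i] != s2[j]:
                continue
            # A match that can be extended to the left was already covered
            # by the longer match starting at (i - 1, j - 1).
            if i > 0 and j > 0 and s1[i - 1] == s2[j - 1]:
                continue
            # Extend the match to the right for as long as the characters agree.
            temp_len = 0
            while i + temp_len < len1 and j + temp_len < len2 and s1[i + temp_len] == s2[j + temp_len]:
                temp_len += 1
            if temp_len < min_len:
                continue
            substring = s1[i:i + temp_len]
            # Skip the new substring if an existing result already contains it.
            if any(substring in res for res in results):
                continue
            # Otherwise drop the results that the new substring contains, then add it.
            results = {res for res in results if res not in substring}
            results.add(substring)
    return list(results)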
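

## For comparison, a minimal sketch of the itertools-style brute force described
## at the top of this file. The helper name brute_force_common_substrings is
## hypothetical and not part of the original script. It returns every common
## substring of at least min_len characters, including ones nested inside
## longer matches, so it is meant only for cross-checking on short inputs.
from itertools import combinations


def brute_force_common_substrings(s1, s2, min_len=4):
    # Each (start, end) pair with start < end selects one substring of s1.
    candidates = {
        s1[start:end]
        for start, end in combinations(range(len(s1) + 1), 2)
        if end - start >= min_len
    }
    # Keep only the candidates that also occur somewhere in s2.
    return {sub for sub in candidates if sub in s2}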

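def read_excel_content(filename):
    # Open the Excel file
    workbook = openpyxl.load_workbook(filename)
    # Get the active worksheet (normally the first one)
    sheet = workbook.active
    # Join the values in column A into one string. Empty cells are skipped,
    # because str(None) would insert the literal text "None" into the result.
    content = ''.join(str(cell.value) for cell in sheet['A'] if cell.value is not None)
    return content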

if __name__ == "__main__":
    # Read the data from the Excel files
    str1 = read_excel_content('file1.xlsx')
    str2 = read_excel_content('file2.xlsx')

    common_substrings = find_common_substrings(str1, str2)
    print(common_substrings)