Help! I can't find the bug in my Python code

# Read the file names and links from nameURL.txt
with open('nameURL.txt', 'r', encoding='utf-8') as file:
    name_url_lines = file.readlines()

# Read the already-downloaded file names from remote_filename.txt (with the extension removed)
with open('remote_filename.txt', 'r', encoding='utf-8') as file:
    downloaded_files = [line.strip('.mp3') for line in file.read().splitlines()]

# For each line, check whether the file name is in the downloaded list; if not, write it to undownload_mp3.txt
with open('undownload_mp3.txt', 'w', encoding='utf-8') as file:
    for line in name_url_lines:
        filename, url = line.split(',')
        if filename not in downloaded_files:
            file.write(f'{filename.strip()},{url.strip()}\n')

print('File names and links that have not been downloaded were written to undownload_mp3.txt.')

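The problem I hit while testing: undownload_mp3.txt still contains a small number of file names that are already listed as downloaded in remote_filename.txt, for example the file name on the first line of data, The-Joshua-Generation-grows-up.

I really can't work out where it goes wrong, so I'm asking here. Thanks.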

nameURL.txt link: https://github.com/deyunwanxin/rawData/blob/main/nameURL.txt

remote_filename.txt link: https://github.com/deyunwanxin/rawData/blob/main/remote_filename.txt

undownload_mp3.txt link: https://github.com/deyunwanxin/rawData/blob/main/undownload_mp3.txt

20 replies • 2024-01-23 21:13:27 +08:00

yiguanxianyu

335 days ago

You could pull out the few problematic lines and debug them on their own.

WoofZJ

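335 days ago

That's not how strip works. When you pass '.mp3', it keeps removing characters from the end for as long as the last character is one of those four, so The-Joshua-Generation-grows-up.mp3 becomes The-Joshua-Generation-grows-u.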
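
A minimal sketch of the behavior described above. The names `mp3-intro.mp3` and `hello.mp3` are made-up examples, not entries from the linked files; they are only there to show that the leading end is affected too and why most names appear to work.

```python
name = 'The-Joshua-Generation-grows-up.mp3'

# strip() treats its argument as a set of characters ('.', 'm', 'p', '3')
# and removes them from BOTH ends until it meets a character outside the set.
stripped = name.strip('.mp3')
print(stripped)  # The-Joshua-Generation-grows-u

# Hypothetical name that starts with characters from the set:
# the leading "mp3" is removed as well.
leading = 'mp3-intro.mp3'.strip('.mp3')
print(leading)  # -intro

# Hypothetical name whose stem ends in a character outside the set:
# the result happens to look correct, which hides the bug.
lucky = 'hello.mp3'.strip('.mp3')
print(lucky)  # hello
```

Only names whose stem starts or ends with '.', 'm', 'p' or '3' are damaged, which matches the "small number" of wrong entries reported in the question.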

watry

335 days ago

line.strip('.mp3') does not remove the string '.mp3'; it removes any of the characters in it, so the whole of 'p.mp3' gets stripped off.

WoofZJ

335 days ago

Strings have a removesuffix method, and that is what you should use: line.strip('.mp3') => line.removesuffix('.mp3')
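
A short sketch of that replacement. `str.removesuffix` is available from Python 3.9 onward; the `remove_suffix` helper below is a hypothetical fallback for older interpreters and is not part of the original code.

```python
name = 'The-Joshua-Generation-grows-up.mp3'

# removesuffix() removes the exact trailing string, and only if it is present.
fixed = name.removesuffix('.mp3')
print(fixed)  # The-Joshua-Generation-grows-up


def remove_suffix(text, suffix):
    """Hypothetical fallback for Python versions older than 3.9."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


print(remove_suffix(name, '.mp3'))       # The-Joshua-Generation-grows-up
print(remove_suffix('no-ext', '.mp3'))   # no-ext
```

In the question's code this would mean changing the list comprehension to use `line.removesuffix('.mp3')` instead of `line.strip('.mp3')`.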

Persimmon08

335 days ago

@WoofZJ Thanks for pointing that out, I really appreciate it.

Persimmon08

335 days ago

@Persimmon08 Thank you, I appreciate the guidance.

MiketsuSmasher

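335 days ago

The others above have already explained it clearly: line.strip('.mp3') is not the way to remove a suffix. Have a look at the official documentation; this method does something quite different from what you expect: https://docs.python.org/zh-cn/3/library/stdtypes.html#str.strip

To remove a suffix, change it to line.removesuffix('.mp3'): https://docs.python.org/zh-cn/3/library/stdtypes.html#str.removesuffix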

Persimmon08

335 days ago

@yiguanxianyu I still have a lot to learn, sorry for the silly question.

Persimmon08

335 days ago

@MiketsuSmasher Thanks for the pointer, I'll go and read the documentation properly.

NoOneNoBody

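335 days ago

1. Converting name_url_lines and downloaded_files to sets and taking an intersection or difference is more convenient.
2. To trim the end of a string, use rstrip rather than strip, so that the start of the string is not matched as well.
3. To remove a trailing substring, use a regular expression. The argument to strip/rstrip is a set of characters, equivalent to {'.', 'm', 'p', '3'} in any order, not the ordered string '.mp3', as already explained above. Also, on Windows file names are case-insensitive, so a regular expression is the better choice there in order to match both upper and lower case; on other systems it depends on your requirements.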
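
One way the suggestions above might be combined, as a sketch only. The sample lines and the example.com URLs are invented stand-ins for the contents of nameURL.txt and remote_filename.txt, and `SUFFIX_RE` is a name introduced here for illustration.

```python
import re

# Hypothetical sample data standing in for the two input files.
name_url_lines = [
    'The-Joshua-Generation-grows-up,https://example.com/a.mp3\n',
    'Another-Episode,https://example.com/b.mp3\n',
]
remote_lines = ['The-Joshua-Generation-grows-up.MP3', 'mp3-intro.mp3']

# Match a literal ".mp3" at the end only, ignoring case (.mp3, .MP3, .Mp3).
SUFFIX_RE = re.compile(r'\.mp3$', re.IGNORECASE)

# A set gives fast membership tests and drops duplicate names.
downloaded = {SUFFIX_RE.sub('', line.strip()) for line in remote_lines}

pending = []
for line in name_url_lines:
    # Split on the first comma only, in case the URL itself contains commas.
    filename, url = line.strip().split(',', 1)
    if filename not in downloaded:
        pending.append((filename, url))

print(pending)  # [('Another-Episode', 'https://example.com/b.mp3')]
```

The loop keeps the original file order; a plain set difference would also work but does not preserve order. Whether the extension should really be matched case-insensitively depends on how the remote file names were produced.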

Persimmon08

335 days ago

@NoOneNoBody Thank you very much, I've learned a lot more from this.