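To quickly download every photo referenced in a blog's exported plain-text MT file, for local backup or later use, I wrote this Python script together with ChatGPT. Xuite (隨意窩) is shutting down at the end of August, so this small program should help quite a few people. If it helped you, you are welcome to sponsor me via the button on the right. Thank you.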
GitHub: https://github.com/qwe987299/MTfileGetImagesToLocal
Usage
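1. Rename the plain-text MT file you want to process to input.txt and place it in the same folder as run.py.
2. Double-click the run.bat batch file to start.
3. When the terminal shows "ALL DONE!!!", the run is complete.
4. The images subdirectory holds all downloaded image files, and output.txt is the new plain-text MT file with the image URLs rewritten.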
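Before running step 2, it can be useful to preview which image URLs the file contains. The following is a minimal sketch, not part of the repository: `find_image_urls` and `IMG_SRC` are hypothetical names, and the pattern is a slightly more tolerant variant of the one in the script (it also accepts `<img ...>` tags that are not closed with `/>`).

```python
import re

# Assumption: image URLs appear as double-quoted src attributes of <img> tags.
# Unlike the script's pattern, this one does not require the tag to end in "/>".
IMG_SRC = re.compile(r'<img\s[^>]*?src="([^"]+)"', re.IGNORECASE)


def find_image_urls(text):
    """Return the unique image URLs in text, in order of first appearance."""
    return list(dict.fromkeys(IMG_SRC.findall(text)))


if __name__ == "__main__":
    # Made-up sample content; a real run would read input.txt instead.
    sample = (
        '<p><img src="https://example.com/a/1.jpg" alt="" /></p>\n'
        '<img class="x" src="https://example.com/a/2.png">\n'
        '<img src="https://example.com/a/1.jpg" />\n'
    )
    for url in find_image_urls(sample):
        print(url)
```

Using `dict.fromkeys` instead of `set` removes duplicates while keeping the URLs in the order they appear in the file, which makes the preview easier to compare against the original post.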
Full source code
import os
import re
import requests

def download_image(url, output_dir):
    """Download url into output_dir, trying at most num_retries times."""
    num_retries = 2
    for i in range(num_retries):
        try:
            # A timeout keeps one stalled request from hanging the whole run
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            # The local file name is the last path segment of the URL
            image_name = os.path.basename(url)
            with open(os.path.join(output_dir, image_name), "wb") as f:
                f.write(response.content)
            print(f"Downloaded {image_name} to {output_dir}")
            break
        except (requests.exceptions.RequestException, IOError) as e:
            print(f"Failed to download {url} (attempt {i+1}/{num_retries})")
            if i == num_retries - 1:
                print(f"Gave up downloading {url}: {str(e)}")

def main(input_file):
    # Create output directory
    output_dir = "images"
    os.makedirs(output_dir, exist_ok=True)
    # Open input file
    with open(input_file, "r", encoding='utf-8') as f:
        lines = f.readlines()
    # Find all image URLs
    img_urls = []
    for line in lines:
        matches = re.findall(r'<img\s.*?src="(.*?)".*?/>', line)
        img_urls.extend(matches)
    # Remove duplicates from the image URL list
    img_urls = list(set(img_urls))
    # Download all images
    for img_url in img_urls:
        download_image(img_url, output_dir)
    # Replace each remote image URL with its local path under images/
    output_lines = []
    for line in lines:
        output_line = line
        for img_url in img_urls:
            output_line = output_line.replace(
                img_url, f"{output_dir}/{os.path.basename(img_url)}")
        output_lines.append(output_line)
    # Write the rewritten MT file
    with open("output.txt", "w", encoding='utf-8') as f:
        f.writelines(output_lines)

if __name__ == "__main__":
    input_file = "input.txt"
    main(input_file)
    print("ALL DONE!!!")
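The script names each downloaded file with `os.path.basename(url)`. If a URL carries a query string, or two different URLs end in the same file name, the resulting names can be awkward or overwrite each other. One possible way to handle this is sketched below; `build_name_map` is a hypothetical helper, not part of the repository, and whether such URLs actually occur in a given export is an assumption worth checking first.

```python
import hashlib
import os
from urllib.parse import unquote, urlsplit


def build_name_map(urls):
    """Map each distinct URL to a local file name that no other URL uses."""
    names = {}
    taken = set()
    for url in dict.fromkeys(urls):  # drop duplicate URLs, keep order
        # urlsplit().path leaves out any ?query or #fragment part
        path = unquote(urlsplit(url).path)
        name = os.path.basename(path) or "image"
        if name in taken:
            # Same file name from a different URL: add a short hash of the URL
            stem, ext = os.path.splitext(name)
            digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]
            name = f"{stem}_{digest}{ext}"
        taken.add(name)
        names[url] = name
    return names


if __name__ == "__main__":
    # Made-up URLs for illustration only
    demo = [
        "https://example.com/album1/photo.jpg?size=large",
        "https://example.com/album2/photo.jpg",
    ]
    for url, name in build_name_map(demo).items():
        print(url, "->", name)
```

To use a mapping like this, both the download step and the URL replacement step would need to look up the name in the same dictionary, so that output.txt points at the file that was actually written.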
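▲ The plain-text MT file to be processed must first be renamed input.txt. It contains many remote image URLs, which the program will detect and download automatically. Run the run.bat batch file to start.
▲ run.bat is now running. The terminal shows whether each download succeeded or failed.
▲ When the terminal shows "ALL DONE!!!", the run is complete.
▲ The images subdirectory in the project holds all downloaded image files, and output.txt is the new plain-text MT file with the image URLs rewritten.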
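Since failed downloads are only reported in the terminal, it may be worth checking afterwards that every local path written into output.txt really exists. The sketch below is one way to do that; `missing_local_images` is a hypothetical helper, and it assumes the rewritten paths appear as `src="images/..."` attributes, as the replacement step in the script produces.

```python
import os
import re

# Assumption: rewritten image references look like src="images/<file name>"
LOCAL_REF = re.compile(r'src="(images/[^"]+)"')


def missing_local_images(mt_text, base_dir="."):
    """Return the images/... paths in mt_text that are not files under base_dir."""
    refs = dict.fromkeys(LOCAL_REF.findall(mt_text))  # unique, in order
    return [ref for ref in refs
            if not os.path.isfile(os.path.join(base_dir, ref))]


if __name__ == "__main__":
    if os.path.exists("output.txt"):
        with open("output.txt", encoding="utf-8") as f:
            missing = missing_local_images(f.read())
        print(f"{len(missing)} referenced image(s) missing")
        for ref in missing:
            print("  ", ref)
```

Any path listed as missing corresponds to a URL the script gave up on, so those images would need to be downloaded again or fetched by hand.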