dify/extractor at main - dify

kurokobo 30deeb6f1c feat(firecrawl): follow pagination when crawl status is completed (#33864) Co-authored-by: Crazywoola <100913391+crazywoola@users.noreply.github.com>		2026-03-23 21:19:32 +08:00
blob	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
entity	Feat/update notion preview (#29345)	2025-12-16 16:43:45 +08:00
firecrawl	feat(firecrawl): follow pagination when crawl status is completed (#33864)	2026-03-23 21:19:32 +08:00
unstructured	refactor: use dynamic max characters for chunking in extractors (#26782)	2025-10-13 10:22:59 +08:00
watercrawl	refactor(api): type WaterCrawl API responses with TypedDict (#33700)	2026-03-19 10:35:44 +09:00
csv_extractor.py	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
excel_extractor.py	perf(core/rag): optimize Excel extractor performance and memory usage (#29551)	2025-12-12 12:15:03 +08:00
extract_processor.py	fix: fix failed test (#33241)	2026-03-11 09:37:19 +08:00
extractor_base.py	chore(api/core): apply ruff reformatting (#7624)	2024-09-10 17:00:20 +08:00
helpers.py	fix: detect_file_encodings TypeError: tuple indices must be integers or slices, not str (#29595)	2025-12-17 13:58:05 +08:00
html_extractor.py	chore: cleanup unnecessary mypy suppressions on imports (#24712)	2025-08-28 23:17:25 +08:00
jina_reader_extractor.py	feat: knowledge pipeline (#25360)	2025-09-18 12:49:10 +08:00
markdown_extractor.py	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
notion_extractor.py	fix: handle missing `credential_id` (#30051)	2025-12-24 11:21:51 +08:00
pdf_extractor.py	refactor: use EnumText(StorageType) for UploadFile.storage_type (#33728)	2026-03-19 15:15:32 +09:00
text_extractor.py	chore: add ast-grep rule to convert Optional[T] to T | None (#25560)	2025-09-15 13:06:33 +08:00
word_extractor.py	refactor(api): type bare dict/list annotations in remaining rag folder (#33775)	2026-03-20 03:31:06 +09:00