I've noticed that many paragraphs and article sections have been copied and pasted from one Wikipedia article to another, leading to excessive amounts of redundant text on Wikipedia. Do any tools, scripts, or APIs exist that would make it possible to automatically identify these duplicate sections and paragraphs (so that they can be removed)?
Merging duplicate sections on Wikipedia
mediawiki wikipedia
Best Answer
I'm afraid there's no way to do this using the API or anything similar. However, you could probably process the Wikimedia dumps offline to find the sort of duplication you're looking for. People who are already researching Wikipedia content might also be able to help you out.
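As a rough illustration of the dump-based approach, here is a minimal sketch in Python. It assumes you have already extracted plain article text from a dump into a dict of `{title: text}`; that dict, the function names, and the `min_length` cutoff are all my own assumptions, not part of any Wikimedia tool. It hashes each normalized paragraph and reports the ones that appear in more than one article.

```python
import hashlib
import re
from collections import defaultdict


def normalize(paragraph):
    """Lowercase and collapse whitespace so trivial differences do not hide a copy."""
    return re.sub(r"\s+", " ", paragraph).strip().lower()


def find_duplicate_paragraphs(articles, min_length=80):
    """Return {paragraph: [titles]} for paragraphs found in two or more articles.

    `articles` is a hypothetical {title: plain_text} dict built from a dump;
    paragraphs are assumed to be separated by blank lines.
    """
    titles_by_hash = defaultdict(set)
    sample_by_hash = {}
    for title, text in articles.items():
        for paragraph in re.split(r"\n\s*\n", text):
            norm = normalize(paragraph)
            if len(norm) < min_length:
                continue  # skip short fragments, which match by coincidence
            digest = hashlib.sha1(norm.encode("utf-8")).hexdigest()
            titles_by_hash[digest].add(title)
            # Remember the first raw copy seen so it can be shown in the report.
            sample_by_hash.setdefault(digest, paragraph.strip())
    return {
        sample_by_hash[digest]: sorted(titles)
        for digest, titles in titles_by_hash.items()
        if len(titles) > 1
    }
```

Note that this only catches copies that are identical after case and whitespace normalization. Paragraphs that were lightly edited after being pasted would need a fuzzier comparison (for example, overlapping word n-grams), and real dump text would need its wiki markup stripped first.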