Intraday.my - identification of cover duplication
Overall:
On this site, every image with a srcset has a URL to its original version, plus URLs with concrete sizes at the end. Since the cover doesn't always use the same size version as its duplicate in the article, I first get the original version of every image with a regex (this is also necessary to improve quality). Then, if there's an image in the body with the EXACT SAME URL as the cover, I remove the cover, because
"The cover must not be set if: The cover image is duplicated in the article".
Two images with the exact same URL can't be different.
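The check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the template's actual code: the function name `should_drop_cover` and its arguments are hypothetical, and it assumes both the cover URL and the body image URLs have already been reduced to their original versions.

```python
def should_drop_cover(cover_url, body_image_urls):
    """Return True when the cover URL also appears among the body image URLs.

    Assumption: every URL passed in was already normalized to its
    original (size-less) version, so a plain string comparison is enough.
    """
    return cover_url in set(body_image_urls)


# Example values taken from the cases discussed in this note.
body = [
    "https://intraday.my/wp-content/uploads/2018/10/youtube.png",
    "https://intraday.my/wp-content/uploads/2019/01/intraday-scrape-1200x628px.png",
]
same = should_drop_cover(
    "https://intraday.my/wp-content/uploads/2018/10/youtube.png", body
)
different = should_drop_cover(
    "https://intraday.my/wp-content/uploads/2018/10/youtube.jpg", body
)
```

Here `same` is True (the cover would be removed) and `different` is False (the cover stays), because only an exact URL match counts as a duplicate.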
Regex:
The template replaces the size-mark only when it sits directly before the file-extension, and only for images that have a srcset. If an img doesn't have a size-mark at the end of its URL, or doesn't have a srcset, nothing is replaced. If it has a srcset and a size-mark at the end, it must have an original version too, so the replacement can't lead to an invalid URL.
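As an illustration of the rule above, here is a small Python sketch. The pattern and the names `SIZE_MARK` and `original_url` are assumptions made for this example; the regex actually used in the template may differ. It also assumes the URL ends with the file-extension (no query string).

```python
import re

# Assumed pattern: "-<width>x<height>" directly before the final ".ext".
SIZE_MARK = re.compile(r"-\d+x\d+(?=\.[A-Za-z0-9]+$)")


def original_url(url, has_srcset):
    """Strip a trailing size-mark, but only for images that have a srcset."""
    if not has_srcset:
        # Without a srcset there is no guaranteed original version,
        # so the URL is left untouched.
        return url
    return SIZE_MARK.sub("", url, count=1)


base = "https://intraday.my/wp-content/uploads/"
static_and_dynamic = original_url(
    base + "2019/01/intraday-scrape-1200x628px-750x430.png", True
)
sized_png = original_url(base + "2018/10/youtube-300x205.png", True)
no_srcset = original_url(base + "2018/10/youtube-300x205.png", False)
```

With these inputs, `static_and_dynamic` keeps the "1200x628px" part and loses only "-750x430", `sized_png` becomes the `youtube.png` URL, and `no_srcset` comes back unchanged.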
Potentially difficult cases:
Image that has a static size-mark in its file name, before the dynamic one:
https://intraday.my/wp-content/uploads/2019/01/intraday-scrape-1200x628px-750x430.png
"1200x628" doesn't get replaced, only the dynamic "750x430" before .png:
https://intraday.my/wp-content/uploads/2019/01/intraday-scrape-1200x628px.png
So it can't cause an invalid URL.
Image with a size-mark at the end that only looks dynamic:
Deleting it leads to an invalid URL:
But since it doesn't have an original version, it doesn't have a srcset, so it won't be replaced.
Different images having nearly identical image URLs, where only the file-extension is different:
https://intraday.my/wp-content/uploads/2018/10/youtube.jpg
https://intraday.my/wp-content/uploads/2018/10/youtube-300x205.png
The second one has a srcset and a size-mark, so the template gets its original version:
https://intraday.my/wp-content/uploads/2018/10/youtube.png
Only the size-mark gets removed; the file-extension is still different from the other one, so the images aren't the same, and it doesn't get removed. See:
Result: works reliably
Another example article, besides the Analisis ones, with a duplicated image:
https://intraday.my/kisah-trader-misteri-jepun-jana-keuntungan-34-juta-ketika-ramai-orang-panik/
In my template: