TL;DR - Yes, web scraping is legal for a publicly available website, i.e. one that is not behind a login. Storing personal information is different and comes under GDPR in the UK.

Update (19th April 2022): a US appeals court has reaffirmed its ruling in the HiQ vs. LinkedIn case linked below.

Twitter

For a project I’ve got a few thousand tweet URLs that I need to get the data and images/video for. The tweets are more than 7 days old.

So I can’t use the API to get the data.

Is it legal to scrape the data from the public website?

From the Terms of Service (https://twitter.com/en/tos) it looks like you are not allowed to scrape their data.

However, here is a lawyer's guide to the legality, https://blog.apify.com/is-web-scraping-legal/, which says:

Web scraping is legal if you scrape data publicly available on the internet. But some kinds of data are protected by international regulations, so be careful scraping personal data, intellectual property, or confidential data. Respect your target websites and use empathy to create ethical scrapers.

And here are 2 court cases: https://brightdata.com/legality-of-web-data-collection

See the "Legal Precedent" section of https://www.crawlnow.com/blog/is-web-scraping-legal, specifically HiQ vs. LinkedIn (September 2019): https://law.justia.com/cases/federal/appellate-courts/ca9/17-16783/17-16783-2019-09-09.html

https://www.reddit.com/r/webscraping/comments/nb3a7t/is_scraping_twitter_commercially_legal/

Please see my next article on the technical challenges of, and commercial alternatives for, getting public information from the web.

Commercial Scrapers

There are many commercial web scrapers out there that scrape Twitter too:

https://phantombuster.com/

https://apify.com/ who have a Twitter Scraper

https://brightdata.com/legality-of-web-data-collection

Code Scrapers

https://github.com/JustAnotherArchivist/snscrape
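
Whichever scraper is used, a list of a few thousand tweet URLs first has to be turned into tweet IDs. Below is a minimal Python sketch of that step using only the standard library. The URL pattern (`twitter.com/<user>/status/<id>`, optionally with a `www.` or `mobile.` prefix) is my assumption about how the URLs look, and the function names `parse_tweet_url` and `unique_tweet_ids` are made up for illustration. It does not fetch anything from Twitter.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumed URL shape: https://twitter.com/<user>/status/<digits>
# (www. and mobile. prefixes allowed). Other shapes, such as
# /i/web/status/<id>, are deliberately not handled in this sketch.
TWEET_URL = re.compile(
    r"^https?://(?:www\.|mobile\.)?twitter\.com/(\w{1,15})/status/(\d+)"
)


def parse_tweet_url(url: str) -> Optional[Tuple[str, int]]:
    """Return (username, tweet_id), or None if the URL does not match."""
    match = TWEET_URL.match(url.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def unique_tweet_ids(urls: Iterable[str]) -> List[int]:
    """Collect tweet IDs in first-seen order, skipping bad URLs and duplicates."""
    seen = set()
    ids = []
    for url in urls:
        parsed = parse_tweet_url(url)
        if parsed is None:
            continue  # not a tweet URL in the assumed shape
        _, tweet_id = parsed
        if tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


if __name__ == "__main__":
    # Hypothetical sample input; the user name and IDs are invented.
    sample = [
        "https://twitter.com/example_user/status/1234567890123456789?s=20",
        "https://mobile.twitter.com/example_user/status/1234567890123456789",
        "https://example.com/not-a-tweet",
    ]
    print(unique_tweet_ids(sample))
```

De-duplicating up front keeps the number of page requests down, which fits the "respect your target websites" advice quoted earlier.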

Doing good

https://apify.com/web-scraping#benefit-humanity

https://brightinitiative.com/