Got a 520 Error while trying to post this, natch.
Bob: Scroll to the bottom of the page and click "Legal". Agreed: as far as I can tell, nowhere does it say that CrowdGather is Yuku, but that's a small detail, right? Right?
CrowdGather User Terms and Conditions wrote: "We encourage you to read through and review these Terms and Conditions each time you access the Site."
ahahahahahahahahahahahaha
JFerio: I'm pretty sure that the ALLCAPS wins out; those are supposed to be the important provisions. In fact, I would say that this section "expressly authorizes" taking a backup. IANAL, but if a contract tells a person that they have sole responsibility for doing something, that same contract cannot also tell them they are not allowed to use the tools required to meet that responsibility.
If a scraper is slow, it would be hard to tell, but not impossible. I'd probably look for patterns where the same IP hits a lot of different pages on the same domain. Most real users switch between threads a lot, or read only the latest posts in a thread, and they tend not to read every thread sequentially. And if you use a scraper, sometimes you get a unique user agent. So yeah, there's forensics you can do, even with a low request rate.
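For the curious, here's roughly what I mean, as a minimal Python sketch. To be clear, this is my guess at how you could do it, not anything I know Yuku or CrowdGather actually runs. The `Hit` record, the `scraper_signals` function, the signal names, and both thresholds (`min_pages`, `min_sequential`) are all made up for illustration, and it assumes you already have each IP's requests in time order.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One page request from a (hypothetical) access log."""
    ip: str
    thread_id: int
    page: int
    user_agent: str


def scraper_signals(hits, min_pages=20, min_sequential=0.8):
    """Return {ip: set of signal names} for IPs that look scraper-like.

    `hits` is assumed to be in time order. The thresholds are arbitrary
    placeholders, not tuned values.
    """
    by_ip = defaultdict(list)
    ips_per_agent = defaultdict(set)
    for h in hits:
        by_ip[h.ip].append(h)
        ips_per_agent[h.user_agent].add(h.ip)

    report = {}
    for ip, ip_hits in by_ip.items():
        signals = set()

        # Signal 1: the same IP hits a lot of different pages.
        distinct_pages = {(h.thread_id, h.page) for h in ip_hits}
        if len(distinct_pages) >= min_pages:
            signals.add("many_distinct_pages")

        # Signal 2: most requests are "next page of the same thread",
        # i.e. threads are being read front to back.
        steps = list(zip(ip_hits, ip_hits[1:]))
        sequential = sum(
            1 for a, b in steps
            if b.thread_id == a.thread_id and b.page == a.page + 1
        )
        if steps and sequential / len(steps) >= min_sequential:
            signals.add("sequential_reading")

        # Signal 3: a user agent that no other IP in the log uses.
        if any(len(ips_per_agent[h.user_agent]) == 1 for h in ip_hits):
            signals.add("unique_user_agent")

        if signals:
            report[ip] = signals
    return report
```

Note that nothing in there looks at request rate, which is the point: going slow doesn't hide any of these. On a small log the user agent signal would give plenty of false positives for ordinary users, so I'd only treat it as interesting alongside the other two.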
But are they doing forensics? Gosh I hope not, with all of the errors we've been getting. With this much instability, I'd hope they were working on that instead.
-- ∇×V