sqlite> select count(*) from posts where content like '%<br>%';
33
sqlite> select count(*) from posts where content like '%<%';
361
Did I give you the wrong data set?
Even if you do have the wrong data set, you can just run my converter script from here: http://github.com/labster/taparip/tree/master/convert
If you already have the data in a different DB, the script will still work once you change the connection string to point at that DB. Though, uh, it's pretty easy to make mistakes parsing HTML if you've been doing it with regexes.
The correct data set was here, I just checked: http://www.dropbox.com/s/mhhcgbu5mujzb ... .gzip?dl=0
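If you want to see which tags are actually left over rather than just how many rows contain a `<`, here is a rough sketch in Python. It is not the converter script, only a quick check. It assumes the same `posts` table and `content` column as the queries above, and it uses a made-up in-memory sample in place of the real database. It reads tags with the standard library's `html.parser` instead of regexes.

```python
import sqlite3
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Adds the name of every start tag it sees to a shared Counter."""

    def __init__(self, counts):
        super().__init__()
        self.counts = counts

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def leftover_tags(conn):
    """Return (number of rows containing '<', Counter of tag names)."""
    counts = Counter()
    rows = 0
    query = "SELECT content FROM posts WHERE content LIKE '%<%'"
    for (content,) in conn.execute(query):
        rows += 1
        parser = TagCounter(counts)  # fresh parser per row, no carried-over state
        parser.feed(content)
        parser.close()
    return rows, counts


# Hypothetical sample data; replace with sqlite3.connect(<your db file>).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (content TEXT)")
conn.executemany(
    "INSERT INTO posts (content) VALUES (?)",
    [
        ("line one<br>line two",),
        ("plain text",),
        ("<b>bold</b> text",),
        ('see <a href="x">link</a><br>',),
    ],
)

rows, tags = leftover_tags(conn)
print(rows, dict(tags))
```

Running this against the real file should tell you whether the remaining `<` rows are mostly `<br>` or something else the conversion missed.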