I need to extract and sort some strings by crawling just two indexes of a website.
Here's the info:
- The first index goes from A to Z
- The second index is always nested inside the first one and goes from 0-100 to 800-900 (it is simply used to display 100 words at a time)
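
To make the page structure concrete, here is a rough sketch of how the two nested indexes could be enumerated. The URL template and the names `URL_TEMPLATE`, `index_pages` and `page_urls` are placeholders I made up for illustration; the real site's URL scheme will differ, and the sketch assumes the ranges advance in steps of 100 (0-100, 100-200, ..., 800-900).

```python
from string import ascii_uppercase

# Placeholder only: the real site's URL scheme is not specified here.
URL_TEMPLATE = "https://example.com/index/{letter}/{start}-{end}"


def index_pages(step=100, last_end=900):
    """Yield (letter, start, end) for every nested index page.

    Assumes the first index is A to Z and the second one runs
    0-100, 100-200, ... up to 800-900 under each letter.
    """
    for letter in ascii_uppercase:
        for start in range(0, last_end, step):
            yield letter, start, start + step


def page_urls():
    """Build one (hypothetical) URL per index page."""
    return [
        URL_TEMPLATE.format(letter=letter, start=start, end=end)
        for letter, start, end in index_pages()
    ]
```

Under these assumptions that is 26 letters times 9 ranges, so at most 234 pages to fetch.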
We need all these words split into the following files, each in alphabetical order:
file 1: one-word strings only
file 2: two-word strings
file 3: three-word strings
file 4: strings with four or more words
The total number of strings is about 20,000.
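
The splitting step could look roughly like the sketch below. It assumes words are separated by whitespace, that sorting should ignore upper/lower case, and that the fourth file takes every string of four or more words. The function names and the `file1.txt` ... `file4.txt` naming are only illustrative.

```python
from collections import defaultdict


def bucket_by_word_count(strings, max_bucket=4):
    """Group strings by word count and sort each group alphabetically.

    Strings with max_bucket or more words all land in the last bucket
    (an assumption about what file 4 should contain).
    """
    buckets = defaultdict(list)
    for raw in strings:
        words = raw.split()  # whitespace-separated words
        if not words:
            continue  # skip empty or blank entries
        key = min(len(words), max_bucket)
        buckets[key].append(" ".join(words))  # normalise inner spacing
    # Case-insensitive alphabetical order within each bucket.
    return {n: sorted(buckets[n], key=str.casefold) for n in range(1, max_bucket + 1)}


def write_buckets(buckets, prefix="file"):
    """Write one text file per bucket, one string per line."""
    for n, items in buckets.items():
        with open(f"{prefix}{n}.txt", "w", encoding="utf-8") as fh:
            for item in items:
                fh.write(item + "\n")
```

With about 20,000 strings everything fits comfortably in memory, so a single in-memory sort per file should be enough.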