Data Scrape on Videogame Websites and Database Creation

Cancelled Posted 4 years ago Paid on delivery

Phase I: Data Scrape

1. Web scrape - perform a detailed scrape of the provided websites to capture historical data across all seasons, along with profile statistics.

a. Program pulls based on: username; ID; player ID.

b. One tab for each game, including all fields required for individuals and teams.

c. Scrape the top 500 players and top 25 teams for all historical and current seasons.

d. Distinguish fields for individuals vs. teams.

e. Data scraped automatically at configurable intervals (1h, 1d, 1w, etc.).

f. Timeline: pull every season from the earliest recorded season through to the current season.

2. Time range - allow updated statistics to be selected for specific time frames (e.g. a range of days, weeks, seasons, etc.).

3. Data output - .csv, .xlsx and other relevant, easily editable formats, allowing for any potential reconciliation.

4. Games and provided websites for web scrape -

a. Apex Legends [login to view URL]

b. Fortnite [login to view URL]

c. Clash Royale [login to view URL]

d. League of Legends [login to view URL]

e. Overwatch [login to view URL]; [login to view URL]

f. Counter-Strike: Global Offensive [login to view URL]

g. DOTA [login to view URL]
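
As a rough illustration of items 1e, 2 and 3 above, the sketch below parses an interval string such as "1h" or "1w", filters scraped rows to a chosen time range, and writes the result as CSV. It is only a minimal sketch: the function names and the `scraped_at` and `player_id` fields are assumptions made for the example, not part of the brief, and .xlsx output would need a separate library.

```python
import csv
import io
import re
from datetime import datetime, timedelta

# Assumed unit suffixes for scrape intervals such as "1h", "1d", "1w".
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_interval(text: str) -> timedelta:
    """Turn a string like '1h' or '2w' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hdw])", text.strip().lower())
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def filter_by_range(rows, start: datetime, end: datetime):
    """Keep rows whose 'scraped_at' timestamp falls within [start, end]."""
    return [row for row in rows if start <= row["scraped_at"] <= end]


def to_csv(rows, fieldnames) -> str:
    """Serialise rows to CSV text; one such file could back each game's tab."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A scheduler would call the scraper every `parse_interval(...)` and append rows with a timestamp, so that any later time-range selection is a plain filter over stored data.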

Phase II: Data build and sort framework

1. Data build – The source websites provided for each game will be prioritized, so that lower-ranked websites are only used to fill in missing data, never to replace existing data. E.g. if the website ranked 1 has no value for field X, the program moves to the website ranked 2 to fill field X.

2. Data reconciliation - reconcile incongruent data sets across different websites.

3. Data ranking – use a standard algorithm (ranking based on the population set) to rank fields, players and teams by weighting fields and seasons. Allow fields to be selected/deselected in the algorithm and weightings to be changed.

4. Data sort – provide an easily selectable option to sort by field, player, season or range of seasons (previous season, last x seasons, all seasons, etc.).

5. Data correlations – build correlations of individual fields to overall ranking.

6. Data output –

a. Dropdown search with comparison – allow a dropdown to choose a player or team by either ranking or name, with a comparison search of up to 2 other selections. Include an option to display particular field groupings as per note 2. Display the player’s/team’s current-season and all-time ranking above all field statistics.

b. Research output – charts to load automatically: progression of ranking over time, with an option to switch to field ranking; the 3 highest correlations between ranking and the player’s/team’s fields; the top 3 field improvements required to optimize ranking, either for self-improvement or relative to a comparison set.

c. Game search and output – dropdown to choose game or season. Charts to load automatically: top 3, top 10, player’s/team’s movements over time; most improved player/team; momentum best/worst performance; top 100 breakthroughs; top 10 breakthroughs.
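
The data build and data ranking items above (Phase II, items 1 and 3) could look roughly like the sketch below. It is only one possible reading: the brief does not name a ranking algorithm, so a weighted average of per-field percentile scores is assumed here, and all function and field names are hypothetical.

```python
def build_record(sources):
    """Merge per-site records given in priority order (website ranked 1 first).

    A lower-ranked site only fills fields that are still missing; it never
    overwrites a value already taken from a higher-ranked site.
    """
    merged = {}
    for record in sources:
        for field, value in record.items():
            if merged.get(field) is None and value is not None:
                merged[field] = value
    return merged


def percentile_scores(values):
    """Map each value to the share of the population it equals or exceeds."""
    n = len(values)
    return [sum(other <= v for other in values) / n for v in values]


def rank_players(players, weights):
    """Rank players by a weighted average of per-field percentile scores.

    `players` maps name -> {field: value}; `weights` maps field -> weight.
    A field with weight 0 is treated as deselected.
    """
    names = list(players)
    active = {f: w for f, w in weights.items() if w > 0}
    total_weight = sum(active.values())
    totals = dict.fromkeys(names, 0.0)
    for field, weight in active.items():
        scores = percentile_scores([players[n][field] for n in names])
        for name, score in zip(names, scores):
            totals[name] += weight * score / total_weight
    return sorted(names, key=lambda n: totals[n], reverse=True)
```

Because scores are percentiles within the population, fields on very different scales (for example kills and win rate) can be combined without further normalisation, and changing a weighting only requires calling `rank_players` again.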

Notes:

1. All data columns (and potentially rows) to have a one-click sort feature.

2. Allow an option to group fields into 3 buckets, so that on output, only one set of fields is shown.

3. Add-ons – current season data must continue to be updated; the template should auto-load for the next season, and manual add-ons should be easy.
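
Notes 1 and 2 could be handled along the lines of the sketch below, which shows a single bucket of fields at a time and sorts rows by one column. The bucket names and field names are invented for the example; the brief only states that there are 3 buckets.

```python
# Hypothetical buckets and field names, for illustration only.
FIELD_BUCKETS = {
    "combat": ["kills", "damage"],
    "results": ["wins", "win_rate"],
    "activity": ["matches", "hours"],
}


def select_bucket(row, bucket, key_fields=("player",)):
    """Return only the identifying fields plus the fields in one bucket."""
    wanted = list(key_fields) + FIELD_BUCKETS[bucket]
    return {field: row[field] for field in wanted if field in row}


def sort_rows(rows, field, descending=True):
    """Sort rows by a single column, as a one-click column sort would."""
    return sorted(rows, key=lambda row: row[field], reverse=descending)
```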

The sample at the bottom shows how the output should look, but with more fields.

Skills: Data Mining, Excel, PHP, Software Architecture, Web Scraping

Project ID: #19789475

About the project

Remote project Active 4 years ago