
Data Brawl 2021-2022


We hope you have been enjoying our events and lectures. To give everyone a chance to learn by doing, we present the 2021-2022 Data Brawl. This is a beginner-oriented data science mini-competition in which you will be tasked with generating meaningful insights from one of three datasets we provide. We have a prize pool of at least £200 up for grabs!

Update: deadline has been extended to 19/12/2021 10:00 pm!

As this is a mini-competition, we do not expect you to spend overly long on this task (some of you might also have exams)! Expect to spend one or two days at most on it.

Working in teams is also much more fun than working alone. Feel free to use the time after the lectures to look for teammates!


Task

  • Using one of the three datasets below, formulate your own research question and submit a notebook with your findings.
  • The research question should be relevant and interesting to you!
  • There are no further specifications, but in general we expect some exploratory data analysis, visualisations, and applications of statistics and/or machine learning algorithms.

You will be judged by committee members of ICDSS.

The scoring breakdown

Demonstration of your technical skills accounts for 50% of the mark

  • Data Exploration: techniques used to understand the dataset (10%)
  • Insight visualisation: quality, relevance and effectiveness of visualisations (10%)
  • Analytical Techniques: correctness of method and justification (25%)
  • Model Validation: using metrics to quantify performance (5%)
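As a rough illustration of the Model Validation criterion, here is a minimal sketch of holding out a test set and reporting a few metrics against a naive baseline. It is only one possible approach, not a required method: the data is synthetic and made up for the example, the midpoint-threshold "model" is a stand-in for whatever you actually fit, and all function names are our own.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Synthetic toy data (made up for illustration): one numeric score per
# respondent and a binary label. Replace with your own features and labels.
rng = random.Random(42)
rows = [(rng.gauss(1.0 if label else -1.0, 1.0), label) for label in [0, 1] * 100]

train, test = train_test_split(rows)

# Stand-in "model": predict positive when the score exceeds the midpoint
# of the two class means, estimated on the training set only.
pos = [x for x, y in train if y == 1]
neg = [x for x, y in train if y == 0]
threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

y_true = [y for _, y in test]
y_pred = [int(x > threshold) for x, _ in test]

# Naive baseline: always predict the majority class seen in training.
train_labels = [y for _, y in train]
majority = max((0, 1), key=train_labels.count)
baseline_pred = [majority] * len(test)

precision, recall = precision_recall(y_true, y_pred)
print(f"model accuracy:    {accuracy(y_true, y_pred):.2f}")
print(f"baseline accuracy: {accuracy(y_true, baseline_pred):.2f}")
print(f"precision: {precision:.2f}, recall: {recall:.2f}")
```

Comparing against a baseline and evaluating only on rows the model never saw are the two habits worth showing in your notebook, whichever metric suits your research question.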

The other 50% of your score will be based on the presentation of your findings

  • Creativity: how original is your angle of exploration? (15%)
  • Story and interpretation of results: what narrative weaves your work together? How do your findings answer the research question? (35%)

Rules and Submission

  • You may compete in groups of 1-3 people per team. Participants MUST be a member of ICDSS registered with Imperial College Union and be an undergraduate or masters student at Imperial.
  • Submissions must be completed by 19/12/2021 10:00 pm in the form of a notebook
  • Winners will be expected to write a blog post about their submission for ICDSS and optionally do an additional presentation for members to watch!

Link to the Team sign up form: https://forms.office.com/r/4mPjBJzpgH. This MUST be completed by the deadline.

Link to the Devpost for submissions: https://data-brawl-2021.devpost.com/

Prizes

  • The prize pool will be worth at least £200, likely in the form of vouchers
  • There will be 9 prizes: a winner and two honourable mentions for each dataset

Datasets

Introvert vs Extrovert Prediction 👩‍💻

Link to dataset

The age-old question: do you think you are an introvert or an extrovert? This is a collection of 7000 answers to 282 personality questions. Can you correlate respondents' answers with whether they self-identify as outgoing or not? Do you think they're telling the truth? Which questions in the survey are the most powerful?

If you choose this dataset you will be doing some interesting work in psychology and the social sciences!
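One possible starting point for the "most powerful questions" angle is to rank each question by how strongly its answers correlate with the self-reported label. The sketch below assumes numeric answers and a 0/1 label; the question names `Q1` to `Q3` and the six toy respondents are hypothetical and made up for illustration, not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; with a 0/1 label this is the point-biserial correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column tells us nothing about the label
    return cov / math.sqrt(var_x * var_y)


def rank_questions(answers, labels):
    """Return (question, r) pairs sorted by absolute correlation, strongest first."""
    scores = {name: pearson(column, labels) for name, column in answers.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy data (made up): answers on a 1-5 scale from six respondents.
answers = {
    "Q1": [5, 4, 5, 1, 2, 1],
    "Q2": [3, 3, 2, 3, 2, 3],
    "Q3": [1, 2, 1, 5, 4, 5],
}
labels = [1, 1, 1, 0, 0, 0]  # 1 = self-identified extrovert (assumed encoding)

for name, r in rank_questions(answers, labels):
    print(f"{name}: r = {r:+.2f}")
```

A strong correlation only shows that a question tracks the self-reported label; whether respondents are telling the truth is a separate question that needs its own argument.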

Student Alcohol Consumption…. and grades? 🍺 📒

Link to dataset

The data were obtained in a survey of students in math and Portuguese language courses in secondary school. It contains a lot of interesting social, gender and study information about students. You can use it for some EDA or try to predict students' final grades.

Prediction of Music Genre? 👩‍🎤🎺🎻

Link to Dataset

This is a big dataset to do with the characteristics of music of different genres. It’s very comprehensive so see if you can make sense of the data!