All,
I am pleased to announce the People in Public (PIP) collection campaign.
This campaign will collect 600 different activities that people naturally perform in public places and in shared spaces like the home. Our goal is to collect three types of data:
- New activities. We want to collect all of the interesting activities that people perform in public places. For example, eating, drinking, searching in bags, putting on clothes, throwing objects or cleaning. Defining a list of things that people do has been an interesting exercise. There are a lot of them, and our list is by no means comprehensive.
- Subtle activities. We want to collect activities that cover all of the subtle things that people do when they don’t think they are doing anything. For example, smoothing their hair, nudging their friends with their elbow or cracking their knuckles. We call this data “fine-grained activities” and we believe data like this will help reduce the false alarms that we described here:
- Activity variations. Finally, we want to collect data that covers the massive variation in the ways that people perform activities with different objects. Many activities look different to machine learning depending on the object involved, like carrying a bicycle vs. carrying a piece of furniture, or closing a door with your foot when your hands are full vs. closing a door with your hip. We call this “within-class variation” and we want more data on these cases so that the machine learning does not miss an activity when it is performed in a different way.
I want to give you a sense of the scale of the data we are collecting. For those of you who remember the MEVA data collection from last year, that collection focused on 37 activities and brought in about 400K submitted videos. The PIP collection is targeting over 20x as many activities and around 1M videos. This will be the largest consented, publicly available dataset of people doing things ever created. I am super excited about what the research community can achieve with data like this to challenge the state of the art in visual AI.
We expect this collection to run over the next three months. You may submit every day, and we are targeting between 50 and 100 collections rolled out for you to choose from. The pricing of each collection depends on its complexity, so some pay more based on how long we expect them to take or how difficult they are to perform correctly. This is a big step up in complexity from our previous collections, so we will need to experiment with how many collections are too many for one pair to perform in a day, and how many different collections our review team can accurately check in a day. Your feedback is appreciated here. We are trying to do something that has never been done before, so your suggestions posted to the group here are invaluable in helping us converge on a successful collection campaign.
The collections will be rolled out slowly over the next week so that we can create training videos from the best submissions for newer collectors. If you have not received the PIP project yet in-app, please be patient. We are working hard to push it out to you as soon as possible.
Thanks,
-Jeff and the Collector Team