Case study: Hello, Cookie!

feat. Boris (1) + Sendhil (students, interfaces)
feat. Sendhil + team: schemas, roadmap

Overview

Goal, participants, purpose

Participants:

A group gathering cookie recipes + turning them into collections
A set of cookie data-sources (from old books, from people)

Goals:

Identify users and use cases to get feedback + measure progress
Structure schemas for each; define which parts of their use will touch R1 and other tools
Identify similar approaches for comparison + benchmarking

Atoms:
- Ingredients
- Measures
- Tools
- Actions
Derived atoms:
- Substitutes (equivalence classes of ingredients)
- Definitions (aliases; concepts, e.g. "what is a chocolate chip cookie?")
- Categories (folksonomy of tags applied to the above)
Recipes:
- Recipe steps: combinations of atoms
- Traditional recipes - sequences of steps + time + free description
- Parametric recipes - models: tunable parameters + outcomes
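
One way the atoms and recipe types above could be written down as a schema is sketched below. This is an illustrative assumption, not an agreed design: the class names (`Atom`, `Step`, `TraditionalRecipe`, `ParametricRecipe`), their fields, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Atom:
    """A basic unit: an ingredient, measure, tool, or action."""
    kind: str  # assumed vocabulary: "ingredient", "measure", "tool", "action"
    name: str


@dataclass
class Step:
    """A recipe step: a combination of atoms."""
    action: Atom
    ingredients: List[Atom] = field(default_factory=list)
    tools: List[Atom] = field(default_factory=list)
    # Quantity per ingredient name, as (amount, measure name).
    amounts: Dict[str, Tuple[float, str]] = field(default_factory=dict)


@dataclass
class TraditionalRecipe:
    """A sequence of steps plus time and a free description."""
    title: str
    steps: List[Step]
    total_minutes: Optional[int] = None
    description: str = ""

    def ingredient_names(self) -> List[str]:
        # Collect each ingredient once, in order of first use.
        seen: List[str] = []
        for step in self.steps:
            for ing in step.ingredients:
                if ing.name not in seen:
                    seen.append(ing.name)
        return seen


@dataclass
class ParametricRecipe:
    """A model: tunable parameters plus the outcomes they affect."""
    title: str
    parameters: Dict[str, float]  # tunable values, keyed by parameter name
    outcomes: List[str]           # names of outcomes only, no predictions


# Hypothetical sample data.
butter = Atom("ingredient", "butter")
sugar = Atom("ingredient", "sugar")
cream = Atom("action", "cream")
bowl = Atom("tool", "mixing bowl")

recipe = TraditionalRecipe(
    title="Example cookie",
    steps=[Step(action=cream, ingredients=[butter, sugar], tools=[bowl])],
)
print(recipe.ingredient_names())  # ['butter', 'sugar']
```

Keeping atoms as small immutable records lets the same ingredient or tool be shared across steps and recipes, which is what the derived atoms (substitutes, definitions, categories) would hang off.
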
Types of categories:
- by input
- by output
- by materials
- related filters
  - filter by what is available
  - fuzzy filtering w/ substitutes
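
The two filters above (by what is available, and fuzzy filtering with substitutes) can be sketched as follows. It is a minimal sketch that assumes substitutes are stored as equivalence classes of ingredient names; the function names and all sample data are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical substitute classes: each set is one equivalence class.
SUBSTITUTE_CLASSES: List[FrozenSet[str]] = [
    frozenset({"butter", "margarine"}),
    frozenset({"white sugar", "brown sugar"}),
]


def substitutes_for(ingredient: str) -> FrozenSet[str]:
    """Return the equivalence class containing the ingredient."""
    for cls in SUBSTITUTE_CLASSES:
        if ingredient in cls:
            return cls
    # An ingredient with no known substitutes is its own class.
    return frozenset({ingredient})


def can_make(required: Iterable[str], available: Iterable[str],
             fuzzy: bool = False) -> bool:
    """True if every required ingredient (or, when fuzzy, a substitute) is on hand."""
    have = set(available)
    for ing in required:
        options = substitutes_for(ing) if fuzzy else frozenset({ing})
        if not (options & have):
            return False
    return True


def filter_recipes(recipes: Dict[str, List[str]], available: Iterable[str],
                   fuzzy: bool = False) -> List[str]:
    """Return titles of recipes that can be made from what is available."""
    have = set(available)
    return [title for title, req in recipes.items()
            if can_make(req, have, fuzzy)]


# Hypothetical sample data.
recipes = {
    "sugar cookie": ["butter", "white sugar", "flour"],
    "oat cookie": ["oats", "butter"],
}
pantry = {"margarine", "white sugar", "flour"}

print(filter_recipes(recipes, pantry))              # []
print(filter_recipes(recipes, pantry, fuzzy=True))  # ['sugar cookie']
```
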

Cookie book: 10 types, 100 recipes, 300 modules
Overview: atlas/feature-universe
- find recipes by feature
- find a specific recipe
- stats on top recipe requests
Recipe sharing
- submit variant or new recipe
- comment/review

Data sources

Data Catalog
- Index of sources from books, sites, scrapers
- Find a maintainer for each, ask for more uniform provenance + metadata
Glossaries and terms
Existing recipes
- Mining old cookbooks (OCR + NLP)
- Scrapable websites: often with limited structure + persistence
Adjacent datasets:
- In culture: related music/art/books
- For access: ingredient cost, nutrition, accessibility
- For delight: taste graphs
Meta datasets: food ontologies (FoodKG)
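
A Data Catalog entry like the ones described at the top of this section might look like the sketch below. The record layout, the `needs_follow_up` helper, and the sample entries are assumptions for illustration; no catalog format has been fixed yet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CatalogEntry:
    """One indexed source: a book, a site, or a scraper."""
    name: str
    kind: str                          # assumed vocabulary: "book", "site", "scraper"
    maintainer: Optional[str] = None   # who to ask about this source
    provenance: Optional[str] = None   # where the data came from


def needs_follow_up(entries: List[CatalogEntry]) -> List[str]:
    """Names of sources that still lack a maintainer or provenance."""
    return [e.name for e in entries
            if e.maintainer is None or e.provenance is None]


# Hypothetical sample entries.
catalog = [
    CatalogEntry("example-cookbook-scan", "book",
                 maintainer="A. Maintainer",
                 provenance="scanned + OCR'd print edition"),
    CatalogEntry("example-recipe-site", "site"),
]

print(needs_follow_up(catalog))  # ['example-recipe-site']
```
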

Process

Creating schemas to match a set of sources
Initial data entry by enthusiasts
- Start by hand, with a spreadsheet for each schema / data source.
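
Hand-entered spreadsheets can be checked against their schema with very little tooling. The sketch below assumes each sheet is exported as CSV and that a schema is just a mapping from column name to "required or not"; the `INGREDIENT_SCHEMA` columns and the sample rows are hypothetical.

```python
import csv
import io
from typing import Dict, List

# Hypothetical schema: column name -> whether a value is required.
INGREDIENT_SCHEMA: Dict[str, bool] = {
    "name": True,
    "category": False,
    "source": True,
}


def check_rows(csv_text: str, schema: Dict[str, bool]) -> List[str]:
    """Return a list of human-readable problems found in the sheet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing_cols = [col for col in schema if col not in header]
    if missing_cols:
        return [f"missing column: {col}" for col in missing_cols]

    problems: List[str] = []
    # Row 1 of the sheet is the header, so data starts at row 2.
    for row_no, row in enumerate(reader, start=2):
        for col, required in schema.items():
            if required and not (row[col] or "").strip():
                problems.append(f"row {row_no}: empty {col}")
    return problems


# Hypothetical sample sheet with one incomplete row.
sample = "name,category,source\nflour,dry,example-book\nbutter,dairy,\n"
print(check_rows(sample, INGREDIENT_SCHEMA))  # ['row 3: empty source']
```
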
Chef League: major choices that guide taste preferences (meat vs. veg, sugar or none); modeling exceptions (binaries for allergies, cilantro, durian)
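
The split between major choices and binary exceptions could be modeled roughly as below. This is one possible reading of the note above, not a settled design: the `TasteProfile` fields, the tag names, and the sample recipes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TasteProfile:
    # Major choices, expressed as tags a recipe must carry (e.g. "vegetarian").
    required_tags: Set[str] = field(default_factory=set)
    # Binary exceptions: ingredients that are never acceptable.
    excluded: Set[str] = field(default_factory=set)


def allowed(ingredients: Set[str], tags: Set[str], profile: TasteProfile) -> bool:
    """True if the recipe carries every required tag and no excluded ingredient."""
    if not profile.required_tags <= tags:
        return False
    return not (profile.excluded & ingredients)


# Hypothetical sample data.
profile = TasteProfile(required_tags={"vegetarian"}, excluded={"cilantro"})

print(allowed({"flour", "butter"}, {"vegetarian"}, profile))    # True
print(allowed({"flour", "cilantro"}, {"vegetarian"}, profile))  # False
```

Treating exceptions as hard exclusions, separate from the broader choices, keeps an allergy from ever being traded off against a preference.
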

Proposed targets

Collections w/ cookie recipes, definitions, and other data
- Examples large (a cookbook) and small (all about one recipe)
Initial data others can query, view, ingest (by October)
Articulate a contribution flow: how recipe writers, makers, testers add to collections

To incorporate

~ existing texts (Myhrvold team: scan + OCR; earlier work)
~ existing structured data: nutrition; diet substitutes; RDA
~~ existing ontologies: FoodOn, WD, FB. cuisines, KG connections
~ specialist data: cooking time; preservation-time in a fridge
~ chef-specialties: techniques, reference works, taste clusters
~ categories: common classifiers (pescatarian, Atkins, Moroccan)
~ schemas: find/make a place to store + version schemas for food/cooking/recipes
