To ensure that we are unearthing comments relevant to the specific region being studied, Lux Research uses a three-part methodology.
1. Leveraging Location Metadata
Some digital platforms embed metadata that can indicate where a piece of content originates. Examples include:
- IP-derived location: When available, we use anonymized data based on IP addresses. This indicates a user’s region at the city, state, or country level, without revealing personal identifiers.
- User-provided location: Some people voluntarily list their location in their online profiles.
We can use this metadata to create a language map. Think of it as a geospatial blueprint showing how language appears in each region.
By marrying the location metadata with the broader conversation, we start understanding how regional culture, climate, and events color people’s perspectives.
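As a rough illustration of the language-map idea, the sketch below tallies word frequencies per region from location-tagged comments. The `build_language_map` helper, the region codes, and the sample comments are hypothetical and chosen for illustration only; this is not a description of Lux Research's actual pipeline.

```python
import re
from collections import Counter, defaultdict


def build_language_map(comments):
    """Tally word counts per region from (region, text) pairs.

    Comments whose region is None (no usable metadata) are skipped.
    """
    language_map = defaultdict(Counter)
    for region, text in comments:
        if region is None:
            continue
        # Lowercase and keep simple alphabetic tokens (apostrophes allowed).
        words = re.findall(r"[a-z']+", text.lower())
        language_map[region].update(words)
    return language_map


# Hypothetical sample data: (region from metadata, comment text).
sample = [
    ("US", "My favorite cookie color"),
    ("GB", "My favourite biscuit colour"),
    (None, "No location here"),
]
lm = build_language_map(sample)
```

A map like this could then be compared across regions to see which terms are distinctive to each one.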
2. Underlying Ethnographic Research
Even after considering metadata, there can be ambiguity, especially in multicultural or highly mobile societies like the U.S. This is where our underlying ethnographic research comes into play. Our technology draws on thousands of our ethnographies to detect patterns associated with specific geographic regions.
3. Discourse Markers
Metadata-based clues are powerful, but they are sometimes partial or missing. This is where discourse markers come in: linguistic features that help us identify or confirm location. Some examples:
- Spelling variants: U.S. English commonly uses “color” or “favorite,” whereas British English uses “colour” or “favourite.”
- Regional expressions: Americans might say “cookie,” whereas a British English speaker would say “biscuit.”
- Local references: Comments about region-specific entertainment services, banking organizations, etc.
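The marker types above can be sketched as a simple lookup. This is a minimal example that assumes a tiny hand-built marker list taken from the examples in the text; the `MARKERS` table and the `guess_region` function are hypothetical, and a real system would need far larger lists and would treat matches as evidence rather than proof.

```python
import re

# Hypothetical marker lists built only from the examples in the text.
MARKERS = {
    "US": {"color", "favorite", "cookie"},
    "GB": {"colour", "favourite", "biscuit"},
}


def score_regions(text):
    """Count how many tokens in the text match each region's markers."""
    words = re.findall(r"[a-z]+", text.lower())
    return {
        region: sum(1 for w in words if w in markers)
        for region, markers in MARKERS.items()
    }


def guess_region(text):
    """Return the region with the most marker hits, or None if unclear."""
    scores = score_regions(text)
    best = max(scores.values())
    winners = [region for region, n in scores.items() if n == best]
    # No hits at all, or a tie between regions, gives no answer.
    if best == 0 or len(winners) != 1:
        return None
    return winners[0]
```

Matching here is exact, so plurals such as "cookies" are not counted. Returning `None` on a tie or on zero hits keeps the sketch from guessing when the evidence is mixed.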
Bringing It All Together
In practice, these three parts (metadata analysis, ethnographic context, and discourse markers) work together dynamically:
- Metadata shows us where someone might be commenting from.
- Ethnographic research locates conversations around the belief systems our anthropologists have identified in each region.
- Discourse markers refine our ability to model where a speaker is likely from when metadata is absent.
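One way to picture the three signals working together is a weighted vote. The text does not say how the signals are actually combined, so the sketch below is purely an assumption: the `Signals` class, the `WEIGHTS` values, and the `combine` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signals:
    """Region suggested by each source; None means the signal is missing."""
    metadata_region: Optional[str] = None
    ethnographic_region: Optional[str] = None
    discourse_region: Optional[str] = None


# Hypothetical integer weights; metadata counts most, but the other two
# signals together can outvote it.
WEIGHTS = {"metadata": 3, "ethnographic": 2, "discourse": 2}


def combine(signals):
    """Return the region with the highest weighted vote, or None."""
    votes = {}
    sources = [
        ("metadata", signals.metadata_region),
        ("ethnographic", signals.ethnographic_region),
        ("discourse", signals.discourse_region),
    ]
    for name, region in sources:
        if region is not None:
            votes[region] = votes.get(region, 0) + WEIGHTS[name]
    if not votes:
        return None
    return max(votes, key=votes.get)
```

With these weights, a comment with no metadata can still be placed using the discourse signal alone, which mirrors the fallback role described for discourse markers.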