TL;DR: see the summary at the end of this post.
I.
Imagine that you are a human xeno-sociologist studying the society of insectoid aliens on another planet. To you, they all look alike the same way two ants look alike. Yet when talking about each other, they often use words that translate closely to “beautiful”, “attractive”, “plain” and “ugly”, indicating that they see an aesthetic quality in each other that you don’t see. How do you learn which specific insectoid individuals are attractive and which ones are not?
The first thing you could try is to take photos of many aliens – let’s say a hundred of them. Then you make a survey which asks a simple binary question under each photo: “Is this individual attractive? Y/N”. You distribute the survey to a large group of aliens, collect the responses and arrange the photos from the one with the fewest Y’s to the one with the most. You designate the first 10 aliens as “10th percentile rank”, the next 10 as “20th”, and so on up to the 10 most attractive aliens in your sample, who get “100th”. Now you can compare the ranks of any two aliens in your study and see which one is more attractive than the other.
You proceed to gather data on attractiveness in more elaborate ways, not just with surveys but by looking for revealed preferences as well – for example, measuring the exoskeleton conductance response of aliens shown different alien faces, or measuring the tips that alien waiters get. Crucially though, two facts hold constant throughout your studies:
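The ranking step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `decile_ranks`, the alien ids and the vote counts are all made up, and ties are broken arbitrarily by id. It follows the ten-per-group rule described above, so the ten aliens with the most Y's land in the group labelled 100.

```python
def decile_ranks(yes_counts):
    """Map each alien's number of Y votes to a decile rank: 10, 20, ..., 100."""
    n = len(yes_counts)
    # Order alien ids from fewest Y's to most; ties are broken by id,
    # which is an arbitrary choice made for this sketch.
    ordered = sorted(yes_counts, key=lambda alien: (yes_counts[alien], alien))
    ranks = {}
    for position, alien in enumerate(ordered):
        # position 0..n-1 -> decile 1..10 -> rank label 10..100
        decile = position * 10 // n + 1
        ranks[alien] = decile * 10
    return ranks


# Hypothetical survey result: 100 aliens, alien i received i "Y" votes.
votes = {f"alien_{i:03d}": i for i in range(100)}
ranks = decile_ranks(votes)

# Comparing two aliens from the sample is now a simple lookup.
print(ranks["alien_042"], ranks["alien_077"])
```

Note that the lookup only works for aliens already in `votes`, which is exactly the limitation of the survey approach.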
Attractiveness is determined by conscious or unconscious reactions of other individuals (binary choice/conductance/tipping)
Attractiveness of an individual can be expressed as a percentile rank among all the other individuals in a study
The data looks nice, but what if you need to check the attractiveness of an alien that’s not in your ranked sample? It’s expensive and cumbersome to run a full study with a large population each time you want to assess whether some specific alien you meet is attractive or not. It would be great to learn how to pick out beautiful aliens simply by looking at them. Is there a way to do that?
You try to extract a set of features associated with each attractiveness rank. You discover that individuals in the top attractiveness ranks have longer antennae and a bulging thorax, while the lowest-ranked aliens have small claws and large mandibles. With years of experience, you can tell with reasonable reliability whether an alien is attractive to other aliens just by looking at four body parts. Based on your findings, you publish a four-feature alien attractiveness model, which correlates moderately with survey rankings (r = 0.53).
Does this model measure attractiveness though? Whatever it measures, it seems to have very different properties from the attractiveness determined by surveys.
Attractiveness is determined by a set of objective morphometric measurements which were found to most closely correlate with attractiveness data
Attractiveness of an individual can be expressed as a deviation from the “ideal score”, with the most beautiful possible combination of features having a deviation of 0
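To make the contrast with the survey concrete, here is a minimal sketch of such a model, assuming the score is a scaled Euclidean distance from an "ideal" set of measurements. Everything numeric here is hypothetical: the ideal values, the scales, the five aliens and their survey ranks are invented for illustration and are not the model from the story.

```python
import math


def deviation_from_ideal(features, ideal, scale):
    """Scaled Euclidean distance from the ideal; 0 means a perfect match."""
    return math.sqrt(
        sum(((features[k] - ideal[k]) / scale[k]) ** 2 for k in ideal)
    )


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical ideal measurements and per-feature scales.
ideal = {"antennae": 12.0, "thorax": 8.0, "claws": 5.0, "mandibles": 2.0}
scale = {"antennae": 2.0, "thorax": 1.0, "claws": 1.0, "mandibles": 1.0}

# Hypothetical sample: (measurements, survey percentile rank).
sample = [
    ({"antennae": 12.0, "thorax": 8.0, "claws": 5.0, "mandibles": 2.0}, 90),
    ({"antennae": 11.0, "thorax": 7.5, "claws": 4.5, "mandibles": 2.5}, 70),
    ({"antennae": 9.0, "thorax": 6.0, "claws": 3.0, "mandibles": 4.0}, 30),
    ({"antennae": 10.0, "thorax": 8.0, "claws": 5.0, "mandibles": 3.0}, 80),
    ({"antennae": 8.0, "thorax": 5.0, "claws": 3.0, "mandibles": 5.0}, 10),
]

deviations = [deviation_from_ideal(f, ideal, scale) for f, _ in sample]
survey_ranks = [rank for _, rank in sample]

# A larger deviation should go with a lower survey rank, so r is negative.
r = pearson_r(deviations, survey_ranks)
print(round(r, 2))
```

Unlike the survey, this score can be computed for any alien you can measure, but it only tracks the survey as well as the correlation allows.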
Anyway, it seems like the expedition was a success! After you return to Earth from your trip, you unexpectedly bump into a stunningly attractive stranger reading your favorite book. After a few words, you get her contact number and invite her on a date. How did you know this stranger was attractive? You didn’t find her photo in a top percentile of a beauty survey. You didn’t run her morphometric data through a beauty-determining algorithm. You simply got some sort of… aesthetic feeling in your mind, not intrinsically tied to any specific quantifiable metric.
It seems like the way you determine attractiveness in your day-to-day life is completely different from what you were studying on an alien planet:
Attractiveness is determined by your brain’s subjective reaction to sensory stimuli it receives
Attractiveness of an individual can be expressed with subjective qualitative statements such as “He makes my heart flutter” or “She is not my type”
So to sum up, it seems like there are three different sources that can provide information about someone’s attractiveness:
1. Introspection provides instant subjective assessment of attractiveness, but it’s not inherently quantifiable and does not necessarily generalize.
2. Population study aggregates subjective assessments from multiple people in the study population, allowing you to find the most and least attractive people among a small set of samples. However, it’s slow and expensive to run, and different study designs will give different results. It also doesn’t tell you anything about people who weren’t in the study sample.
3. Morphometric analysis gives an attractiveness score based on a set of objective criteria (height, waist-to-hip ratio, etc.). It’s quick once you know what to look for, but even the best models produce only a middling correlation with study results.
Crucially, these are three different things.
There are people you find attractive that society in general doesn't. There are people society finds attractive that you don't. There are unattractive people who score highly on objective morphometrics and attractive people who don't.
So, how do we make sense of this mess?
II.
Let’s start by tabooing the word “attractiveness”. Instead, we will use three new words, one for each source of information in our arsenal.
1. Appeal is a subjective feeling you get when looking at an attractive person.
People with high appeal stir feelings of desire inside you. People with low appeal will make you recoil in disgust.
2. Desirability is a rank people get on a study measuring aggregated subjective responses.
People with high desirability produce positive reactions in other people. People with low desirability get negative responses.
3. Perfection is adherence to a set of empirically deduced objective beauty measurements that try to predict survey results based on morphometrics.
People with high perfection have a specific ideal height, facial symmetry, body proportions and muscle definition. People with low perfection deviate from this ideal.
To make an analogy with music, tracks that are not Appealing wouldn’t make it into your playlist, tracks that are not Desirable wouldn’t make it to the Top 40, and tracks that are not Perfect probably sound out of tune or have an unconventional time signature.
So, how do they stack together? Let me go over each of the 8 possible combinations and do my best to imagine what response they would provoke.
High Appeal, high Desirability, high Perfection
This person fits standard beauty norms, and you find them attractive as much as most people do.
Low Appeal, high Desirability, high Perfection
“Too basic for my taste”. This person is widely considered an icon of beauty, but for some reason you're not into the prevailing beauty standard, so you’re not excited to get together with them.
High Appeal, low Desirability, high Perfection
“How come I don't have competition?”. You adore this person, but you can't understand why they aren't more popular with the opposite sex.
High Appeal, high Desirability, low Perfection
"There's something about her, can't quite put a finger on it". An unexplainable charm, a trendmaker defying expectations.
High Appeal, low Desirability, low Perfection
This is an "ugly" person who you fall for anyway. Perhaps you have a very specific and rare turn-on?
Low Appeal, high Desirability, low Perfection
"What does anyone find in him??". This person is a sex symbol for no obvious reason to you.
Low Appeal, low Desirability, high Perfection
"Wow, the character creator tool in this game sucks. Whatever sliders I tweak, it still looks awful". This person is off-putting in a way that's not legible or quantifiable, and it’s hard to pinpoint what needs to be corrected.
Low Appeal, low Desirability, low Perfection
This person is unattractive in a really obvious way, for you as much as for everyone else.
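The eight cases above are just every high/low assignment over three axes, which can be enumerated mechanically. The sketch below does that and flags two things for each case: whether your own reaction agrees with the crowd, and whether the crowd agrees with the morphometric model. The function and key names are hypothetical, chosen for this illustration.

```python
from itertools import product

AXES = ("Appeal", "Desirability", "Perfection")


def describe(appeal_high, desirability_high, perfection_high):
    """Label one combination and record which pairs of axes agree."""
    levels = (appeal_high, desirability_high, perfection_high)
    label = ", ".join(
        f"{'High' if high else 'Low'} {axis}" for axis, high in zip(AXES, levels)
    )
    return {
        "label": label,
        # Your own reaction points the same way as the aggregate response.
        "taste_matches_crowd": appeal_high == desirability_high,
        # The aggregate response points the same way as the measurements.
        "crowd_matches_model": desirability_high == perfection_high,
    }


# product() yields all 2 * 2 * 2 = 8 high/low assignments.
combinations = [describe(*levels) for levels in product((True, False), repeat=3)]

for combo in combinations:
    print(combo)
```

Each flag is true in exactly four of the eight cases, so neither kind of agreement is guaranteed by the other.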
What lessons can we draw from this exercise in combinatorics?
When Appeal and Desirability have the same sign, you have “vanilla” preferences. When they have opposite signs, you have an “uncommon taste” in people’s appearance.
When Desirability and Perfection have the same sign, the Desirability is easily legible. When they have opposite signs, it’s “hard to put a finger” on what determines the Desirability result.
Machine learning allows Desirability to be predicted automatically and with high confidence, skipping the step of hand-picking factors to construct a legible Perfection model.
Many disagreements about attractiveness result from people using different definitions of “attractiveness”, for example one person talking about Desirability while another talks about Perfection. This can happen even when no one says the word “attractive” at all! People who use the popular 1–10 scale may be referring to any of the three types of attractiveness.
Consider the case when your interlocutor looks at a picture of a celebrity and assigns them a 1-10 number without looking up any data or doing any measurements or calculations. What is going on in their mind?
People who use the 1-10 scale as a measurement of Appeal can easily replace the numerical score with letters, as in tier lists (S=10, A=8-9, B=6-7…). The numbers in this case are simply a made-up score that doesn’t actually correspond to any data and can’t be derived by calculation (similar to the 1-10 pain score used in an emergency clinic). They would not vocally disagree with people who give completely different scores to the same people - after all, this is not a disagreement about facts but merely a difference in taste, like when someone enjoys different music than you do.
People who use the 1-10 scale as a measurement of Desirability would probably agree that there are as many 10’s as there are 1’s and 5’s out there, or at least that their ranking system can be divided into statistical deciles. When assigning a number, these people essentially make a prediction like “If I asked 10 people whether they would hook up with this person, X% would say yes”, and they count on their experience to predict the “correct” number (which then rarely, if ever, gets tested against actual survey data). These people can say seemingly nonsensical things like “She’s a 10, but I’m not into her” and be correct.
People who use the 1-10 scale as a measurement of Perfection might base it on something like the 90-60-90 cm (36-24-36 in) guideline, despite not having any data on whether people consider this attractive or not. Then any perceived physical “flaws” or deviations would reduce the score. Statements like “She could be a 10, but her shoulders are too broad, and the jaw looks too masculine, so let’s make it an 8” would be an example of this approach.
Coming back to the word “attractiveness”, how can you tell what people mean when they mention it? Here are a few examples.
Substitute “Attractiveness” with “Appeal” if:
They’re excited about getting romantically involved with a partner
They’re filling out a survey about attractiveness
They’re casting actors for their movie to fit their creative vision
Substitute “Attractiveness” with “Desirability” if:
They’re casting actors for their movie to maximize ticket sales
They’re talking about someone being “conventionally attractive”
They’re choosing which photo to put on their dating app profile
Substitute “Attractiveness” with “Perfection” if:
They’re casting a fashion model to showcase designer outfits
They’re a jury member in a bodybuilding contest
They’re filtering a large pool of potential partners by e.g. height
Hopefully this information will be helpful in resolving arguments about attractiveness, so that we can focus on arguing about more important things, like whether anything is ever consensual.
TL;DR: The word “attractiveness” blends three different meanings into one. We can use three different words to be more precise about the meaning we want to convey and to avoid confusion about how objective attractiveness is.
1. Appeal is a subjective feeling you get when looking at e.g. your crush.
Statements like “I don’t find the Hollywood star look attractive” use the word “Attractive” to mean “Appealing”.
2. Desirability is a person’s rank based on responses of other people, expressed as a percentile of population.
Statements like “No point using Tinder unless you’re top 20% attractive” use the word “Attractive” to mean “Desirable”.
3. Perfection is an evaluation according to some set of objective features, such as height, waist-hip ratio, or facial geometry. The set itself is arbitrarily chosen and differs by context.
Statements like “If you’re not attractive, you’ll have a harder time finding clothes that fit” use the word “Attractive” to mean “Perfect”.