Software Engineering
November 28, 2024

Do college admissions use AI Detectors?

Published on
July 9, 2024
No, top colleges do NOT use AI detectors. It is well established that AI detectors are unreliable.

A lot of students and parents are worried: Will my essay be flagged as AI? Can it be flagged even if I only used ChatGPT to review it?

We want to put this discussion to bed, so in this blog, we'll go over the important sources of truth and run some experiments with AI detectors to put your mind at rest on this topic.

Let me start by acknowledging my bias towards AI. I did my Master's in AI and have spent over a third of my life researching and working in this field. I believe AI will be a net positive for humanity. Having addressed that bias, my goal is to be as objective as possible.

What do top colleges say about AI detectors?

  1. MIT: AI detectors don't work
  2. Carnegie Mellon University: Although companies such as Turnitin are beginning to offer AI detection services, none have been established as accurate.
  3. Stanford: AI detectors are biased against non-native English speakers. Current detectors are clearly unreliable and easily gamed.
  4. UPenn: The study of 10 million documents concludes that the current detectors are not robust enough to be of significant use in society just yet.
  5. Harvard: It would be inadvisable to count on automated methods for GenAI detection.

Most articles we've found on colleges using AI detectors were from companies which are selling AI checker products.

AI-text detectors suffer from inaccuracy

AI-text detectors are highly inaccurate. With some manipulation of the text, their accuracy drops as low as 17%. Below, I break leading AI text detectors to illustrate their fragility.

The above research paper mentions some techniques:

1. Adding spelling errors and typos:
Instead of: "The quick brown fox jumps over the lazy dog."
Write: "The quikc brown fox jmups over the lazy dog." The text then looks as if it was typed in a hurry.
2. Writing as a non-native speaker: Ask the LLM to write the text as if you were a non-native speaker of the language.
Instead of: "I am very happy to write this essay for my English class. I hope to get a good grade." Write something like: "I am very happy to writing this essay for my English class. I hope to get good grade."
This adversarial method sought to generate text embodying certain inaccuracies, inappropriate usage, and misunderstandings typical of an NNES (non-native English speaker) possessing a competent yet not advanced level of English proficiency.
3. Increase Burstiness:
Instead of: "The sun shone brightly. The birds chirped. A gentle breeze rustled the leaves. It was a perfect day for a picnic in the park."
Write: "The sun shone brightly. Birds chirped. A gentle breeze rustled the leaves, creating a soothing atmosphere. It was a perfect day for a picnic in the park, with family and friends gathered together to enjoy the lovely weather."
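To make the burstiness example concrete, here is a minimal sketch that measures how much sentence length varies in each version. Treating burstiness as the standard deviation of sentence lengths is an assumption made for illustration; the source does not define the metric, and real detectors may compute it differently. The function names are hypothetical.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on '.', '!' and '?' and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Assumed proxy: population standard deviation of sentence lengths."""
    lengths = sentence_lengths(text)
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0


# The two versions of the picnic example from the list above.
uniform = ("The sun shone brightly. The birds chirped. A gentle breeze rustled "
           "the leaves. It was a perfect day for a picnic in the park.")
varied = ("The sun shone brightly. Birds chirped. A gentle breeze rustled the "
          "leaves, creating a soothing atmosphere. It was a perfect day for a "
          "picnic in the park, with family and friends gathered together to "
          "enjoy the lovely weather.")

print(sentence_lengths(uniform))  # [4, 3, 6, 11]
print(sentence_lengths(varied))   # [4, 2, 10, 22]
print(burstiness(uniform) < burstiness(varied))  # True
```

The second version mixes a two-word sentence with a 22-word one, so its sentence lengths are spread much more widely than in the first version.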

Source

For college admission essays, AI text detectors essentially degenerate to being readability checkers.

As of July 2024, I've played with numerous AI text detectors. Essentially, AI text detectors learn which words are most likely to be generated by AI. AI-written college essays have some obvious telltale signs:

  • Overusing certain words and phrases, such as "tapestry," "kaleidoscopic," and "it's important to note"
  • Flowery language
  • Long sentences
  • Description over specific details
  • Poor readability (a readability score of around 13 or above)
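As a rough illustration of the first and third signs, the sketch below counts the flagged phrases and the number of long sentences in a text. It is a toy checker, not how any real detector works. The phrase list contains only the three examples named above, and the 25-word threshold for a "long" sentence is a hypothetical default.

```python
import re

# Only the phrases named in the list above; an assumed, incomplete list.
FLAGGED_PHRASES = ["tapestry", "kaleidoscopic", "it's important to note"]


def flagged_phrase_counts(text: str) -> dict[str, int]:
    """Count case-insensitive occurrences of each flagged phrase that appears."""
    lowered = text.lower()
    return {p: lowered.count(p) for p in FLAGGED_PHRASES if p in lowered}


def count_long_sentences(text: str, max_words: int = 25) -> int:
    """Count sentences with more than max_words words (threshold is hypothetical)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(1 for s in sentences if len(s.split()) > max_words)


sample = ("It's important to note that my heritage is a rich tapestry. "
          "I love it.")

print(flagged_phrase_counts(sample))
# {'tapestry': 1, "it's important to note": 1}
print(count_long_sentences(sample, max_words=10))  # 1
```

Note that the phrase match uses a straight apostrophe, so text with curly apostrophes would need to be normalized first.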

Running a 100% AI-generated essay through the QuillBot and GPTZero text detectors

I generated a personal statement essay using AI and ran it through the AI detector from QuillBot (one of the leading paraphrasing tools in the world) and through GPTZero.

Growing up in a tight-knit Mexican-American community, I have developed a deep appreciation for my cultural heritage. From the vibrant flavors of authentic Mexican cuisine to the colorful celebrations of holidays like Día de los Muertos, my Mexican roots are woven into every aspect of my life.
As a cultural ambassador in my local community, I have had the privilege of sharing my Mexican-American identity with others. I have organized and participated in cultural events and festivals, such as Cinco de Mayo celebrations and Mexican Independence Day parades, to educate and engage the broader community. Additionally, I have taught Spanish language classes to children and adults, helping them develop a deeper understanding and appreciation for the language and its cultural significance. I am also passionate about advocating for diversity and inclusion in my school and community.
My experiences as a cultural ambassador have had a profound impact on my personal growth and development. They have deepened my appreciation for my cultural heritage and identity, helping me to better understand and celebrate the unique aspects of my background. These experiences have also helped me develop strong communication and leadership skills, as I have worked to engage and inspire others to learn about and appreciate Mexican culture. Most importantly, my role as a cultural ambassador has given me a strong sense of pride and belonging in my ethnic community.
Looking ahead, I aspire to use my experiences as a cultural ambassador to make a meaningful impact in the global business world. I plan to pursue a degree in Latin American Studies, which will provide me with the knowledge and skills needed to work as a cultural diversity consultant for multinational corporations. In this role, I will leverage my language skills, cultural knowledge, and leadership experience to help companies effectively navigate cross-cultural communication and collaboration.

It has every one of the red flags mentioned above.

Voilà! QuillBot says it's 100% AI-generated, and so does GPTZero.

QuillBot Text Detector correctly identifies AI text generation
GPTZero Text Detector correctly identifies AI text generation

100% AI-generated college essays often suffer from readability issues. The main point is that it's challenging to read AI-generated content. Do you remember how many times you've effortlessly read through an entire ChatGPT or Gemini response or essay?

We can measure readability using the Hemingway score. When you paste an AI-generated essay into the GradGPT college essay editor, it will typically have a high Hemingway score (indicating poor readability) and many red lines in the editor.

AI-generated college essays generally have a high Hemingway score
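For readers who want to check readability themselves, here is a minimal sketch of a grade-level score. The source does not say how the Hemingway score is computed, so this uses the Automated Readability Index (ARI) as a stand-in; that substitution, the simple tokenization, and the function name are all assumptions. The two sample sentences are the opening lines of the AI-generated essay above and of its simplified rewrite.

```python
import re


def automated_readability_index(text: str) -> float:
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9'’-]+", text)
    if not sentences or not words:
        return 0.0
    # Count only letters and digits as characters.
    characters = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return (4.71 * characters / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)


original = ("Growing up in a tight-knit Mexican-American community, I have "
            "developed a deep appreciation for my cultural heritage.")
simplified = ("Growing up in a close Mexican-American community, I learned "
              "to love my culture.")

print(round(automated_readability_index(original), 1))
print(round(automated_readability_index(simplified), 1))
```

A higher value means the text needs a higher reading grade level. On these two sentences the original scores several grade levels above the simplified one, though scores on single sentences are only indicative.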

It will also fail the essay coach's writing-style check.

Writing Style for the essay is weak

Fix the readability and the AI detectors fail:

Now re-prompt ChatGPT or Gemini to rewrite the essay for improved readability, asking it to modify the essay by:

  • Removing advanced vocabulary and phrasing.
  • Using a more casual tone.
  • Creating shorter sentences and paragraphs.
  • Making the essay easier to read.
ChatGPT's response when prompted for improved readability

ChatGPT returns the following essay:

Growing up in a close Mexican-American community, I learned to love my culture. From the taste of Mexican food to the celebrations like Día de los Muertos, my heritage is a big part of my life.
As a community leader, I share my Mexican-American identity. I help plan events like Cinco de Mayo and Mexican Independence Day parades. I also teach Spanish to kids and adults. This helps others understand and love the language and culture. I care about diversity and inclusion in my school and community.
Being a community leader has changed me a lot. It made me appreciate my heritage more. I understand and celebrate my background better. I’ve also learned to communicate and lead well. My role makes me proud of my ethnic community.
In the future, I want to use my experiences to make a difference in business. I plan to study Latin American Studies. This will help me work as a cultural diversity consultant. In this job, I’ll use my skills and knowledge to help companies work well across cultures.
GPTZero detector wrongly classifies it as human-generated text

QuillBot detector wrongly classifies 100% AI-generated text as human

A Highly Readable but Weak Essay

Essay does not pass the Content Stage

But this 100% AI-generated essay passes both detectors tested here with flying colors (as of July 2024).

False Positives are Alarming

Even more alarming are the false positives. Here is a college admission essay written 10 years ago, well before transformers existed, let alone large language models.

Human-written essay wrongly classified as AI-generated (a false positive)

But now you can guess why this happened. This essay has readability issues, just like an AI-generated essay would.

Its Hemingway score is above 15. What's most interesting is that the parts highlighted by the AI detector correspond to the unreadable sections of the essay.

Essays with overly complex language and poor readability can be flagged as AI-generated
  • AI detectors suffer from accuracy issues
  • AI detectors often disagree on whether a given text is AI-generated or not, and can flip-flop on their assessments when the same text is reanalyzed.
  • AI detectors may misinterpret elements of good human writing as signs of AI generation, such as clear structure and informative, objective tone.
  • Identifying AI-generated text accurately is only going to get harder with advanced models like GPT-4.

So, what should you focus on when writing your essay?

Write an essay with substance. Fluff does not work.

Write an essay without inconsistencies.

Write an essay that is authentic to who you are. Someone who knows you should feel like it reflects the real you.

Write an essay that's easy to read. Complex writing doesn't help anyone.

While AI can guide, co-pilot, or coach you in crafting and perfecting your essay, the substance, content, and arguments must ultimately come from your own thoughts and experiences.