The Writing Center • University of North Carolina at Chapel Hill

What this handout is about

This handout will help you understand how paragraphs are formed, how to develop stronger paragraphs, and how to completely and clearly express your ideas.

What is a paragraph?

Paragraphs are the building blocks of papers. Many students define paragraphs in terms of length: a paragraph is a group of at least five sentences, a paragraph is half a page long, etc. In reality, though, the unity and coherence of ideas among sentences is what constitutes a paragraph. A paragraph is defined as “a group of sentences or a single sentence that forms a unit” (Lunsford and Connors 116). Length and appearance do not determine whether a section in a paper is a paragraph. For instance, in some styles of writing, particularly journalistic styles, a paragraph can be just one sentence long. Ultimately, a paragraph is a sentence or group of sentences that support one main idea. In this handout, we will refer to this as the “controlling idea,” because it controls what happens in the rest of the paragraph.

How do I decide what to put in a paragraph?

Before you can begin to determine what the composition of a particular paragraph will be, you must first decide on an argument and a working thesis statement for your paper. What is the most important idea that you are trying to convey to your reader? The information in each paragraph must be related to that idea. In other words, your paragraphs should remind your reader that there is a recurrent relationship between your thesis and the information in each paragraph. A working thesis functions like a seed from which your paper, and your ideas, will grow. The whole process is an organic one—a natural progression from a seed to a full-blown paper where there are direct, familial relationships between all of the ideas in the paper.

The decision about what to put into your paragraphs begins with the germination of a seed of ideas; this “germination process” is better known as brainstorming. There are many techniques for brainstorming; whichever one you choose, this stage of paragraph development cannot be skipped. Building paragraphs can be like building a skyscraper: there must be a well-planned foundation that supports what you are building. Any cracks, inconsistencies, or other corruptions of the foundation can cause your whole paper to crumble.

So, let’s suppose that you have done some brainstorming to develop your thesis. What else should you keep in mind as you begin to create paragraphs? Every paragraph in a paper should be:

  • Unified: All of the sentences in a single paragraph should be related to a single controlling idea (often expressed in the topic sentence of the paragraph).
  • Clearly related to the thesis: The sentences should all refer to the central idea, or thesis, of the paper (Rosen and Behrens 119).
  • Coherent: The sentences should be arranged in a logical manner and should follow a definite plan for development (Rosen and Behrens 119).
  • Well-developed: Every idea discussed in the paragraph should be adequately explained and supported through evidence and details that work together to explain the paragraph’s controlling idea (Rosen and Behrens 119).

How do I organize a paragraph?

There are many different ways to organize a paragraph. The organization you choose will depend on the controlling idea of the paragraph. Below are a few possibilities for organization:

  • Narration: Tell a story. Go chronologically, from start to finish.
  • Description: Provide specific details about what something looks, smells, tastes, sounds, or feels like. Organize spatially, in order of appearance, or by topic.
  • Process: Explain how something works, step by step. Perhaps follow a sequence: first, second, third.
  • Classification: Separate into groups or explain the various parts of a topic.
  • Illustration: Give examples and explain how those examples support your point. (See an example in the 5-step process below.)

Illustration paragraph: a 5-step example

From the list above, let’s choose “illustration” as our rhetorical purpose. We’ll walk through a 5-step process for building a paragraph that illustrates a point in an argument. For each step there is an explanation and example. Our example paragraph will be about human misconceptions of piranhas.

Step 1. Decide on a controlling idea and create a topic sentence

Paragraph development begins with the formulation of the controlling idea. This idea directs the paragraph’s development. Often, the controlling idea of a paragraph will appear in the form of a topic sentence. In some cases, you may need more than one sentence to express a paragraph’s controlling idea.

Controlling idea and topic sentence — Despite the fact that piranhas are relatively harmless, many people continue to believe the pervasive myth that piranhas are dangerous to humans.

Step 2. Elaborate on the controlling idea

Paragraph development continues with an elaboration on the controlling idea, perhaps with an explanation, implication, or statement about significance. Our example offers a possible explanation for the pervasiveness of the myth.

Elaboration — This impression of piranhas is exacerbated by their mischaracterization in popular media.

Step 3. Give an example (or multiple examples)

Paragraph development progresses with an example (or more) that illustrates the claims made in the previous sentences.

Example — For example, the promotional poster for the 1978 horror film Piranha features an oversized piranha poised to bite the leg of an unsuspecting woman.

Step 4. Explain the example(s)

The next movement in paragraph development is an explanation of each example and its relevance to the topic sentence. The explanation should demonstrate the value of the example as evidence to support the major claim, or focus, in your paragraph.

Continue the pattern of giving examples and explaining them until all points/examples that the writer deems necessary have been made and explained. NONE of your examples should be left unexplained. You might be able to explain the relationship between the example and the topic sentence in the same sentence which introduced the example. More often, however, you will need to explain that relationship in a separate sentence.

Explanation for example — Such a terrifying representation easily captures the imagination and promotes unnecessary fear.

Notice that the example and explanation steps of this 5-step process (steps 3 and 4) can be repeated as needed. The idea is that you continue to use this pattern until you have completely developed the main idea of the paragraph.

Step 5. Complete the paragraph’s idea or transition into the next paragraph

The final movement in paragraph development involves tying up the loose ends of the paragraph. At this point, you can remind your reader about the relevance of the information to the larger paper, or you can make a concluding point for this example. You might, however, simply transition to the next paragraph.

Sentences for completing a paragraph — While the trope of the man-eating piranhas lends excitement to the adventure stories, it bears little resemblance to the real-life piranha. By paying more attention to fact than fiction, humans may finally be able to let go of this inaccurate belief.

Finished paragraph

Despite the fact that piranhas are relatively harmless, many people continue to believe the pervasive myth that piranhas are dangerous to humans. This impression of piranhas is exacerbated by their mischaracterization in popular media. For example, the promotional poster for the 1978 horror film Piranha features an oversized piranha poised to bite the leg of an unsuspecting woman. Such a terrifying representation easily captures the imagination and promotes unnecessary fear. While the trope of the man-eating piranhas lends excitement to the adventure stories, it bears little resemblance to the real-life piranha. By paying more attention to fact than fiction, humans may finally be able to let go of this inaccurate belief.
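
For readers who think in code, the five steps above can be sketched as a small data structure. This is only an illustrative sketch in Python and is not part of the original handout: the class name `IllustrationParagraph`, its field names, and the `build` method are hypothetical. The check that every example has its own explanation is a simplifying assumption, since an explanation can sometimes share a sentence with its example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class IllustrationParagraph:
    """Hypothetical container for the parts named in steps 1-5."""
    topic_sentence: str   # Step 1: controlling idea
    elaboration: str      # Step 2: elaboration on the controlling idea
    # Steps 3 and 4: (example, explanation) pairs, repeated as needed.
    examples: List[Tuple[str, str]] = field(default_factory=list)
    conclusion: str = ""  # Step 5: completion or transition

    def build(self) -> str:
        """Join the parts in step order; reject unexplained examples."""
        sentences = [self.topic_sentence, self.elaboration]
        for example, explanation in self.examples:
            if not explanation.strip():
                # Assumption: each example needs a separate explanation.
                raise ValueError("every example needs an explanation")
            sentences.extend([example, explanation])
        if self.conclusion:
            sentences.append(self.conclusion)
        return " ".join(sentences)


paragraph = IllustrationParagraph(
    topic_sentence="Despite the fact that piranhas are relatively harmless, "
                   "many people continue to believe the pervasive myth that "
                   "piranhas are dangerous to humans.",
    elaboration="This impression of piranhas is exacerbated by their "
                "mischaracterization in popular media.",
    examples=[(
        "For example, the promotional poster for the 1978 horror film "
        "Piranha features an oversized piranha poised to bite the leg of "
        "an unsuspecting woman.",
        "Such a terrifying representation easily captures the imagination "
        "and promotes unnecessary fear.",
    )],
    conclusion="By paying more attention to fact than fiction, humans may "
               "finally be able to let go of this inaccurate belief.",
)
print(paragraph.build())
```

Adding a second pair to `examples` mirrors repeating steps 3 and 4 until the controlling idea is fully developed.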

Troubleshooting paragraphs

Problem: the paragraph has no topic sentence.

Imagine each paragraph as a sandwich. The real content of the sandwich—the meat or other filling—is in the middle. It includes all the evidence you need to make the point. But it gets kind of messy to eat a sandwich without any bread. Your readers don’t know what to do with all the evidence you’ve given them. So, the top slice of bread (the first sentence of the paragraph) explains the topic (or controlling idea) of the paragraph. And, the bottom slice (the last sentence of the paragraph) tells the reader how the paragraph relates to the broader argument. In the original and revised paragraphs below, notice how a topic sentence expressing the controlling idea tells the reader the point of all the evidence.

Original paragraph

Piranhas rarely feed on large animals; they eat smaller fish and aquatic plants. When confronted with humans, piranhas’ first instinct is to flee, not attack. Their fear of humans makes sense. Far more piranhas are eaten by people than people are eaten by piranhas. If the fish are well-fed, they won’t bite humans.

Revised paragraph

Although most people consider piranhas to be quite dangerous, they are, for the most part, entirely harmless. Piranhas rarely feed on large animals; they eat smaller fish and aquatic plants. When confronted with humans, piranhas’ first instinct is to flee, not attack. Their fear of humans makes sense. Far more piranhas are eaten by people than people are eaten by piranhas. If the fish are well-fed, they won’t bite humans.

Once you have mastered the use of topic sentences, you may decide that the topic sentence for a particular paragraph really shouldn’t be the first sentence of the paragraph. This is fine: the topic sentence can actually go at the beginning, middle, or end of a paragraph; what’s important is that it is in there somewhere so that readers know what the main idea of the paragraph is and how it relates back to the thesis of your paper. Suppose that we wanted to start the piranha paragraph with a transition sentence (something that reminds the reader of what happened in the previous paragraph) rather than with the topic sentence. Let’s suppose that the previous paragraph was about all kinds of animals that people are afraid of, like sharks, snakes, and spiders. Our paragraph might look like this (the topic sentence is the second sentence):

Like sharks, snakes, and spiders, piranhas are widely feared. Although most people consider piranhas to be quite dangerous, they are, for the most part, entirely harmless. Piranhas rarely feed on large animals; they eat smaller fish and aquatic plants. When confronted with humans, piranhas’ first instinct is to flee, not attack. Their fear of humans makes sense. Far more piranhas are eaten by people than people are eaten by piranhas. If the fish are well-fed, they won’t bite humans.

Problem: the paragraph has more than one controlling idea

If a paragraph has more than one main idea, consider eliminating sentences that relate to the second idea, or split the paragraph into two or more paragraphs, each with only one main idea. Watch our short video on reverse outlining to learn a quick way to test whether your paragraphs are unified. In the following paragraph, the final two sentences branch off into a different topic; so, the revised paragraph eliminates them and concludes with a sentence that reminds the reader of the paragraph’s main idea.

Although most people consider piranhas to be quite dangerous, they are, for the most part, entirely harmless. Piranhas rarely feed on large animals; they eat smaller fish and aquatic plants. When confronted with humans, piranhas’ first instinct is to flee, not attack. Their fear of humans makes sense. Far more piranhas are eaten by people than people are eaten by piranhas. A number of South American groups eat piranhas. They fry or grill the fish and then serve them with coconut milk or tucupi, a sauce made from fermented manioc juices.

Problem: transitions are needed within the paragraph

You are probably familiar with the idea that transitions may be needed between paragraphs or sections in a paper (see our handout on transitions). Sometimes they are also helpful within the body of a single paragraph. Within a paragraph, transitions are often single words or short phrases that help to establish relationships between ideas and to create a logical progression of those ideas in a paragraph. This is especially likely to be true within paragraphs that discuss multiple examples. Let’s take a look at a version of our piranha paragraph that uses transitions to orient the reader:

Although most people consider piranhas to be quite dangerous, they are, except in two main situations, entirely harmless. Piranhas rarely feed on large animals; they eat smaller fish and aquatic plants. When confronted with humans, piranhas’ instinct is to flee, not attack. But there are two situations in which a piranha bite is likely. The first is when a frightened piranha is lifted out of the water—for example, if it has been caught in a fishing net. The second is when the water level in pools where piranhas are living falls too low. A large number of fish may be trapped in a single pool, and if they are hungry, they may attack anything that enters the water.

In this example, you can see how the phrases “the first” and “the second” help the reader follow the organization of the ideas in the paragraph.

Works consulted

We consulted these works while writing this handout. This is not a comprehensive list of resources on the handout’s topic, and we encourage you to do your own research to find additional publications. Please do not use this list as a model for the format of your own reference list, as it may not match the citation style you are using. For guidance on formatting citations, please see the UNC Libraries citation tutorial. We revise these tips periodically and welcome feedback.

Lunsford, Andrea. 2008. The St. Martin’s Handbook: Annotated Instructor’s Edition, 6th ed. New York: St. Martin’s.

Rosen, Leonard J., and Laurence Behrens. 2003. The Allyn & Bacon Handbook, 5th ed. New York: Longman.

You may reproduce it for non-commercial use if you use the entire handout and attribute the source: The Writing Center, University of North Carolina at Chapel Hill

11 Rules for Essay Paragraph Structure (with Examples)

How do you structure a paragraph in an essay?

If you’re like the majority of my students, you might be getting your basic essay paragraph structure wrong and getting lower grades than you could!

In this article, I outline the 11 key steps to writing a perfect paragraph. But, this isn’t your normal ‘how to write an essay’ article. Rather, I’ll try to give you some insight into exactly what teachers look out for when they’re grading essays and figuring out what grade to give them.

You can navigate each issue below, or scroll down to read them all:

1. Paragraphs must be at least four sentences long
2. But, at most seven sentences long
3. Your paragraph must be Left-Aligned
4. You need a topic sentence
5. Next, you need an explanation sentence
6. You need to include an example
7. You need to include citations
8. All paragraphs need to be relevant to the marking criteria
9. Only include one key idea per paragraph
10. Keep sentences short
11. Keep quotes short

Paragraph structure is one of the most important elements of getting essay writing right.

As I cover in my Ultimate Guide to Writing an Essay Plan, paragraphs are the heart and soul of your essay.

However, I find most of my students have either:

  • forgotten how to write paragraphs properly,
  • gotten lazy, or
  • never learned it in the first place!

Paragraphs in essay writing are different from paragraphs in other written genres.

In fact, the paragraphs that you are reading now would not help your grades in an essay.

That’s because I’m writing in journalistic style, where paragraph conventions are vastly different.

For those of you coming from journalism or creative writing, you might find you need to re-learn paragraph writing if you want to write well-structured essay paragraphs to get top grades.

Below are eleven reasons your paragraphs are losing marks, and what to do about it!

Essay Paragraph Structure Rules

1. Your Paragraphs must be at least 4 Sentences Long

In journalism and blog writing, a one-sentence paragraph is great. It’s short, to-the-point, and helps guide your reader. For essay paragraph structure, one-sentence paragraphs suck.

A one-sentence essay paragraph sends an instant signal to your teacher that you don’t have much to say on an issue.

A short paragraph signifies that you know something – but not much about it. A one-sentence paragraph lacks detail, depth and insight.

Many students come to me and ask, “what does ‘add depth’ mean?” It’s one of the most common pieces of feedback you’ll see written on the margins of your essay.

Personally, I think ‘add depth’ is bad feedback because it’s a short and vague comment. But, here’s what it means: You’ve not explained your point enough!

If you’re writing one-, two- or three-sentence essay paragraphs, you’re costing yourself marks.

Always aim for at least four sentences per paragraph in your essays.

This doesn’t mean that you should add ‘fluff’ or ‘padding’ sentences.

Make sure you don’t:

a) repeat what you said in different words, or
b) write something just because you need another sentence in there.

But, you need to do some research and find something insightful to add to that two-sentence paragraph if you want to ace your essay.

Check out Points 5 and 6 for some advice on what to add to that short paragraph to add ‘depth’ to your paragraph and start moving to the top of the class.

2. Your Paragraphs must not be more than 7 Sentences Long

Okay, so I just told you to aim for at least four sentences per paragraph. So, what’s the longest your paragraph should be?

Seven sentences. That’s a maximum.

So, here’s the rule:

Between four and seven sentences is the sweet spot that you need to aim for in every single paragraph.

Here’s why your paragraphs shouldn’t be longer than seven sentences:

1. It shows you can organize your thoughts. You need to show your teacher that you’ve broken up your key ideas into manageable segments of text (see Point 9).

2. It makes your work easier to read. You need your writing to be easily readable to make it easy for your teacher to give you good grades. Make your essay easy to read and you’ll get higher marks every time.

One of the most important ways you can make your work easier to read is by writing paragraphs that are no more than seven sentences long.

3. It prevents teacher frustration. Teachers are just like you. When they see a big block of text their eyes glaze over. They get frustrated, lost, their mind wanders … and you lose marks.

To prevent teacher frustration, you need to ensure there’s plenty of white space in your essay. It’s about showing them that the piece is clearly structured into one key idea per ‘chunk’ of text.

Often, you might find that your writing contains tautologies and other turns of phrase that can be shortened for clarity.
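
If you like to check drafts mechanically, the four-to-seven-sentence rule can be sketched in a few lines of Python. This is a rough illustration only: the function names `count_sentences` and `check_length` are made up for this sketch, and the sentence splitter is deliberately naive (it treats any '.', '!' or '?' followed by a space as the end of a sentence, so abbreviations such as "e.g." will be miscounted).

```python
import re

MIN_SENTENCES = 4  # lower bound from Rule 1
MAX_SENTENCES = 7  # upper bound from Rule 2


def count_sentences(paragraph: str) -> int:
    """Naively count sentences: split after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return len([part for part in parts if part])


def check_length(paragraph: str) -> str:
    """Classify a paragraph against the four-to-seven-sentence rule."""
    n = count_sentences(paragraph)
    if n < MIN_SENTENCES:
        return "too short"
    if n > MAX_SENTENCES:
        return "too long"
    return "ok"


draft = (
    "Climate change is endangering polar bears. "
    "The warming atmosphere is melting the polar ice caps."
)
print(count_sentences(draft), check_length(draft))
```

A count is only a prompt to look again. A paragraph that passes can still be padded, and one that fails may just need an explanation or an example added.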

3. Your Paragraph must be Left-Aligned

Turn off ‘Justified’ text and: Never. Turn. It. On. Again.

Justified text is where the words are stretched out to make the paragraph look like a square. It turns the writing into a block. Don’t do it. You will lose marks, I promise you! Win the psychological game with your teacher: left-align your text.

A good essay paragraph is never ‘justified’.

I’m going to repeat this, because it’s important: to prevent your essay from looking like a big block of muddy, hard-to-read text, align your text to the left margin only.

You want white space on your page – and lots of it. White space helps your reader scan through your work. It also prevents it from looking like big blocks of text.

You want your reader reading vertically as much as possible: scanning, browsing, and quickly looking through for evidence you’ve engaged with the big ideas.

Justified text doesn’t help you do that. It makes your writing look like a big, lumpy block of text that your reader doesn’t want to read.

What’s wrong with Center-Aligned Text?

While I’m at it, never, ever, center-align your text either. Center-aligned text is impossible to skim-read. Your teacher wants to be able to quickly scan down the left margin to get the headline information in your paragraph.

Not many people center-align text, but it’s worth repeating: never, ever center-align your essays.

Infographic: left-aligned paragraphs are easy to read. To left-align a paragraph, press Control + L on a PC keyboard or Command + L on a Mac.

Don’t annoy your reader. Left-align your text.

4. Your paragraphs must have a Topic Sentence

The first sentence of an essay paragraph is called the topic sentence. This is one of the most important sentences in the correct essay paragraph structure style.

The topic sentence should convey exactly what key idea you’re going to cover in your paragraph.

Too often, students don’t let their reader know what the key idea of the paragraph is until several sentences in.

You must show what the paragraph is about in the first sentence.

You never, ever want to keep your reader in suspense. Essays are not like creative writing. Tell them straight away what the paragraph is about. In fact, if you can, do it in the first half of the first sentence.

I’ll remind you again: make it easy to grade your work. Your teacher is reading through your work trying to determine what grade to give you. They’re probably going to mark 20 assignments in one sitting. They have no interest in storytelling or creativity. They just want to know how much you know! State what the paragraph is about immediately and move on.

Ideal Essay Paragraph Structure Example: Writing a Topic Sentence

If your paragraph is about how climate change is endangering polar bears, say it immediately: “Climate change is endangering polar bears.” should be your first sentence in your paragraph. Take a look at the first sentence of each of the four paragraphs above this one. You can see from the first sentence of each paragraph what that paragraph discusses.

When editing your work, read each paragraph and try to distil what the one key idea is in your paragraph. Ensure that this key idea is mentioned in the first sentence.

(Note: if there’s more than one key idea in the paragraph, you may have a problem. See Point 9 below.)

The topic sentence is the most important sentence for getting your essay paragraph structure right. So, get your topic sentences right and you’re on the right track to a good essay paragraph.

5. You need an Explanation Sentence

All topic sentences need a follow-up explanation. The very first point on this page was that too often students write paragraphs that are too short. To add what is called ‘depth’ to a paragraph, you can come up with two types of follow-up sentences: explanations and examples.

Let’s take explanation sentences first.

Explanation sentences give additional detail. They often explain how, why, where, or when something happens.

Let’s go back to our example of a paragraph on climate change endangering polar bears. If your topic sentence is “Climate change is endangering polar bears”, then your follow-up explanation sentence is likely to explain how, why, where, or when. You could say:

Ideal Essay Paragraph Structure Example: Writing Explanation Sentences

1. How: “The warming atmosphere is melting the polar ice caps.”
2. Why: “The polar bears’ habitats are shrinking every single year.”
3. Where: “This is happening in the Arctic ice caps near Greenland.”
4. When: “Scientists first noticed the ice caps were shrinking in 1978.”

You don’t have to provide all four of these options each time.

But, if you’re struggling to think of what to add to your paragraph to add depth, consider one of these four options for a good quality explanation sentence.


6. You need to Include an Example

Examples matter! They add detail. They also help to show that you genuinely understand the issue. They show that you don’t just understand a concept in the abstract; you also understand how things work in real life.

Example sentences have the added benefit of personalising an issue. For example, after saying “Polar bears’ habitats are shrinking”, you could note specific habitats, facts and figures, or even a specific story about a bear who was impacted.

Ideal Essay Paragraph Structure Example: Writing an ‘Example’ Sentence

“For example, 770,000 square miles of Arctic sea ice has melted in the past four decades, leading polar bear populations to dwindle (National Geographic, 2018).”

In fact, one of the most effective politicians of our times – Barack Obama – was an expert at this technique. To sell Obamacare, he would often provide examples of people who got sick because they didn’t have healthcare.

What effect did this have? It showed the real-world impact of his ideas. It humanised him, and got him elected president – twice!

Be like Obama. Provide examples. Often.

7. All Paragraphs need Citations

Provide a reference to an academic source in every single body paragraph in the essay. The only two paragraphs where you don’t need a reference are the introduction and conclusion.

Let me repeat: Paragraphs need at least one reference to a quality scholarly source.

Let me go even further:

Students who get the best marks provide two references to two different academic sources in every paragraph.

Two references in a paragraph show you’ve read widely, cross-checked your sources, and given the paragraph real thought.

It’s really important that these references link to academic sources, not random websites, blogs or YouTube videos. Check out our Seven Best types of Sources to Cite in Essays post to get advice on what sources to cite. Number 6 will surprise you!

Ideal Essay Paragraph Structure Example: In-Text Referencing in Paragraphs

Usually, in-text referencing takes the format: (Author, YEAR), but check your school’s referencing formatting requirements carefully. The ‘Author’ section is the author’s last name only. Not their initials. Not their first name. Just their last name. My name is Chris Drew. First name Chris, last name Drew. If you were going to reference an academic article I wrote in 2019, you would reference it like this: (Drew, 2019).

Where do you place those two references?

Place the first reference at the end of the first half of the paragraph. Place the second reference at the end of the second half of the paragraph.

This spreads the references out and makes it look like all the points throughout the paragraph are backed up by your sources. The goal is to make it look like you’ve referenced regularly when your teacher scans through your work.

Remember, teachers can look out for signposts that indicate you’ve followed academic conventions and mentioned the right key ideas.

Spreading your referencing through the paragraph helps to make it look like you’ve followed the academic convention of referencing sources regularly.

Here are some examples of how to reference twice in a paragraph:

  • If your paragraph was six sentences long, you would place your first reference at the end of the third sentence and your second reference at the end of the sixth sentence.
  • If your paragraph was five sentences long, I would recommend placing one at the end of the second sentence and one at the end of the fifth sentence.
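
The two placements above follow a simple pattern that can be written down as arithmetic. Here is a minimal Python sketch; the function name `reference_positions` is hypothetical, and using integer division to generalize to other paragraph lengths is my assumption, since only the five- and six-sentence cases are spelled out above.

```python
from typing import Tuple


def reference_positions(sentence_count: int) -> Tuple[int, int]:
    """Return the 1-based sentence numbers that should end with a reference.

    The first reference closes the first half of the paragraph and the
    second closes the final sentence. Rounding down for odd lengths is an
    assumption that matches the five-sentence example.
    """
    if sentence_count < 2:
        raise ValueError("need at least two sentences to place two references")
    first = sentence_count // 2
    second = sentence_count
    return first, second


print(reference_positions(6))  # six-sentence paragraph: sentences 3 and 6
print(reference_positions(5))  # five-sentence paragraph: sentences 2 and 5
```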

You’ve just read one of the key secrets to winning top marks.

8. Every Paragraph must be relevant to the Marking Criteria

Every paragraph must win you marks. When you’re editing your work, check through the piece to see if every paragraph is relevant to the marking criteria.

For the British: In the British university system (I’m including Australia and New Zealand here – I’ve taught at universities in all three countries), you’ll usually have a ‘marking criteria’. It’s usually a list of between two and six key learning outcomes your teacher needs to use to come up with your score. Sometimes it’s called a:

  • Marking criteria
  • Marking rubric
  • (Key) learning outcome
  • Indicative content

Check your assignment guidance to see if this is present. If so, use this list of learning outcomes to guide what you write. If your paragraphs are irrelevant to these key points, delete the paragraph.

Paragraphs that don’t link to the marking criteria are pointless. They won’t win you marks.

For the Americans: If you don’t have a marking criteria / rubric / outcomes list, you’ll need to stick closely to the essay question or topic. This goes out to those of you in the North American system. North America (including USA and Canada here) is often less structured and the professor might just give you a topic to base your essay on.

If all you’ve got is the essay question / topic, go through each paragraph and make sure each paragraph is relevant to the topic.

For example, if your essay question / topic is on “The Effects of Climate Change on Polar Bears”,

  • Don’t talk about anything that doesn’t have some connection to climate change and polar bears;
  • Don’t talk about the environmental impact of oil spills in the Gulf of Carpentaria;
  • Don’t talk about black bear habitats in British Columbia.
  • Do talk about the effects of climate change on polar bears (and relevant related topics) in every single paragraph.

You may think ‘stay relevant’ is obvious advice, but at least 20% of all essays I mark go off on tangents and waste words.

Stay on topic in Every. Single. Paragraph. If you want to learn more about how to stay on topic, check out our essay planning guide.

9. Only have one Key Idea per Paragraph

One key idea for each paragraph. One key idea for each paragraph. One key idea for each paragraph.

Don’t forget!

Too often, a student starts a paragraph talking about one thing and ends it talking about something totally different. Don’t be that student.

To ensure you’re focussing on one key idea in your paragraph, make sure you know what that key idea is. It should be mentioned in your topic sentence (see Point 4). Every other sentence in the paragraph adds depth to that one key idea.

If you’ve got sentences in your paragraph that are not relevant to the key idea in the paragraph, they don’t fit. They belong in another paragraph.

Go through all your paragraphs when editing your work and check to see if you’ve veered away from your paragraph’s key idea. If so, you might have two or even three key ideas in the one paragraph.

You’re going to have to get those additional key ideas, rip them out, and give them paragraphs of their own.

If you have more than one key idea in a paragraph you will lose marks. I promise you that.

The paragraphs will be too hard to read, your reader will get bogged down reading rather than scanning, and you’ll have lost grades.

10. Keep Sentences Short

If a sentence is too long it gets confusing. When the sentence is confusing, your reader will stop reading your work. They will stop reading the paragraph and move to the next one. They’ll have given up on your paragraph.

Short, snappy sentences are best.

Shorter sentences are easier to read and they make more sense. Too often, students think they have to use big, long, academic words to get the best marks. Wrong. Aim for clarity in every sentence in the paragraph. Your teacher will thank you for it.

The students who get the best marks write clear, short sentences.

When editing your draft, go through your essay and see if you can shorten your longest five sentences.
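
If you'd rather not hunt for the longest sentences by eye, a few lines of Python can find them for you. This is only a rough sketch: the function name `longest_sentences` is made up for illustration, and it assumes a sentence ends at a full stop, question mark or exclamation mark followed by a space, so abbreviations such as "e.g." will confuse it.

```python
import re


def longest_sentences(text, n=5):
    """Return the n longest sentences in text, longest first (by word count)."""
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [part.strip() for part in parts if part.strip()]
    # sorted() is stable, so sentences of equal length keep their original order.
    return sorted(sentences, key=lambda s: len(s.split()), reverse=True)[:n]


draft = (
    "Short one. This sentence is a little bit longer than the first. "
    "Tiny. Here is a medium sentence."
)
for sentence in longest_sentences(draft, n=2):
    print(len(sentence.split()), "words:", sentence)
```

Paste your own draft in place of `draft` and leave `n` at its default of 5 to get the five sentences worth shortening first.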

(To learn more about how to write the best quality sentences, see our page on Seven ways to Write Amazing Sentences.)

11. Keep Quotes Short

Eighty percent of university teachers hate quotes. That’s not an official figure. It’s my guesstimate based on my many interactions in faculty lounges. Twenty percent don’t mind them, but chances are your teacher is one of the eight out of ten who hate quotes.

Teachers tend to be turned off by quotes because they make it look like you don’t know how to say something in your own words.

Now that I’ve warned you, here’s how to use quotes properly:

Ideal Essay Paragraph Structure Example: How To Use Quotes in University-Level Essay Paragraphs

1. Your quote should be less than one sentence long.
2. You should never start a sentence with a quote.
3. You should never end a paragraph with a quote.
4. You should never use more than five quotes per essay.
5. Your quote should never be longer than one line in a paragraph.

The minute your teacher sees that your quote takes up a large chunk of your paragraph, you’ll have lost marks.

Your teacher will circle the quote, write a snarky comment in the margin, and not even bother to give you points for the key idea in the paragraph.

Avoid quotes, but if you really want to use them, follow those five rules above.

I’ve also provided additional pages outlining Seven tips on how to use Quotes if you want to delve deeper into how, when and where to use quotes in essays. Be warned: quoting in essays is harder than you thought.

The basic essay paragraph structure formula includes: 4-6 sentence paragraphs; a clear topic sentence; useful explanations and examples; a focus on one key idea only; and references to two different academic sources.

Follow the advice above and you’ll be well on your way to getting top marks at university.

Writing essay paragraphs that are well structured takes time and practice. Don’t be too hard on yourself and keep on trying!

Below is a summary of our 11 key mistakes for structuring essay paragraphs and tips on how to avoid them.


11 Biggest Essay Paragraph Structure Mistakes you’re probably Making

1. Your paragraphs are too short
2. Your paragraphs are too long
3. Your paragraph alignment is ‘Justified’
4. Your paragraphs are missing a topic sentence
5. Your paragraphs are missing an explanation sentence
6. Your paragraphs are missing an example
7. Your paragraphs are missing references
8. Your paragraphs are not relevant to the marking criteria
9. You’re trying to fit too many ideas into the one paragraph
10. Your sentences are too long
11. Your quotes are too long


Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.


4 thoughts on “11 Rules for Essay Paragraph Structure (with Examples)”

Hello there. I noticed that throughout this article on Essay Writing, you keep on saying that the teacher won’t have time to go through the entire essay. Don’t you think this is a bit discouraging that with all the hard work and time put into your writing, to know that the teacher will not read through the entire paper?

Hi Clarence,

Thanks so much for your comment! I love to hear from readers on their thoughts.

Yes, I agree that it’s incredibly disheartening.

But, I also think students would appreciate hearing the truth.

Behind closed doors, many (if not most) university teachers are very open about the fact that they ‘only have time to skim-read papers’. They regularly bring this up during heated faculty meetings about contract negotiations! For example, at one university I worked at, we were allocated 45 minutes per 10,000 words. That’s 4.5 minutes per 1,000-word essay, and that’d include writing the feedback, too!

If students know the truth, they can better write their essays in a way that will get across the key points even from a ‘skim-read’.

I hope to write candidly on this website – i.e. some of this info will never be written on university blogs because universities want to hide these unfortunate truths from students.

Thanks so much for stopping by!

Regards, Chris

This is wonderful and helpful. All I can say is thank you very much, because I learned a lot from this site, owned by Chris. Thank you, Sir.

Thank you. This helped a lot.


How Many Sentences are in a Paragraph?

The answer to this question depends on the type of writing you do. Are you planning to write an academic paper, a novel, a blog post, or something else entirely? Each format comes with its own set of standards. So, to cut to the chase, here’s how Merriam-Webster defines a paragraph:

“A subdivision of a written composition that consists of one or more sentences, deals with one point or gives the words of one speaker, and begins on a new usually indented line.”

So, by this definition, the length of a paragraph could be one sentence. Alternatively, you could compose a paragraph of infinite length as long as you cover only one point within it. Some authors have taken this idea to an absurd extreme, creating whole novels out of a single paragraph. For example, David Albahari’s novel Leeches consists of one long paragraph, proving that one paragraph can be as long as 300 pages. Based on the Merriam-Webster definition, there’s no limit to the number of sentences in a paragraph.

Of course, by including logical paragraph breaks, you help the reader understand your arguments more clearly. Whether you’re writing short paragraphs or long ones, be sure to separate your central ideas from one another. As a general rule, essays should have an introduction paragraph and a conclusion. Within an essay, you might include any number of body paragraphs that cover different topics.


Standard Essay Structure

For high school papers, many teachers expect to see an essay structure that follows a particular formula. The standard essay structure consists of at least five paragraphs—an introduction, a conclusion, and three supporting paragraphs.

In order to show that you’ve mastered the standard essay structure, you should include 3-5 sentences in each paragraph. Begin most of your paragraphs with a transitional idea that connects the new paragraph to the one that came before. This sentence also introduces the main idea of the paragraph. Next, include 1-3 supporting sentences that build on your idea. Lastly, write a concluding sentence to drive your idea home. The final sentence in the paragraph may also prepare the reader for a clever transition at the beginning of your next paragraph. 

For the introductory paragraph of an essay, the rules are a bit different. Since you don’t need to transition from a previous idea, you can grab the reader’s attention with the first sentence or two. By the last sentence, you should conclude your introductory paragraph with the thesis statement of your essay. The thesis serves as a topic sentence, giving your reader a roadmap for what you hope to prove over the course of your essay. Not only does this sentence introduce your main points, it also provides a preview of what you’ll be saying in the conclusion of your essay. 

Research Papers

What makes a professional research paper different from a high school paper? For one thing, academic papers often include longer paragraphs, jam-packed with information. Writers must form coherent paragraphs, giving papers introductions, conclusions, and supporting arguments, all while maintaining a formal writing style and including dense technical information. You probably won’t see conversational language or silly attention-grabbers in an academic text. Instead, you’re likely to find lengthy paragraphs and more of them.

Blog Posts and Online Articles

The typical paragraph length has been growing shorter in recent years, thanks to the ubiquity of mobile browsing. Since so many people consume content on mobile devices nowadays, authors must be conscious of how their writing will look on a small screen. For this reason, even respected news publications have shifted to a shorter paragraph structure. 

Many news articles and blog posts now include a large number of one-sentence paragraphs, even though such writing would have been considered too abrupt in the past. These ultra-short paragraphs allow for more white space and prove easier to read on mobile devices. When you’re writing for an online publication, keep your paragraphs short and direct.

Creative Writing

Referring back to the definition at the top of this post, you’ll notice that a paragraph, “…gives the words of one speaker.” If you’re writing a novel or short story, this rule can be particularly helpful. With any piece of writing that contains a large amount of dialogue, you’ll be breaking for new paragraphs frequently. Even if a character only says a single word, you need to introduce a new paragraph before the next character’s line of dialogue. As you can imagine, a scene where two characters argue back and forth would require a large number of very short (sometimes even one-word) paragraphs. 

Tips to Remember

As a good rule of thumb, try experimenting with different paragraph lengths. First, master the standard paragraph format, which consists of 3-5 sentences with a clear beginning, middle, and end. Once you’ve succeeded and you feel confident writing paragraphs that transition smoothly, try writing some dialogue. See what it feels like to introduce a number of short paragraphs. Lastly, when you feel ready to mix things up, give long paragraphs some love. See how many words you can write before a new topic introduces itself. After some practice, you’ll feel comfortable writing paragraphs of differing lengths.

Pay attention to the paragraphs you read over the next few days! Notice how many sentences the writer includes in his or her paragraphs. You’ll probably see a large variance. A textbook might have ten-sentence paragraphs, whereas a news website might have one-sentence paragraphs. Ask yourself, “Which kinds of paragraphs do I most enjoy reading?” Then, try writing in that style. 

Who knows? Maybe you’ll find that you love writing 300-page paragraphs, like the author David Albahari. Just don’t be surprised if the person reading (or grading) your work doesn’t share that preference. 




Kari Lisa Johnson

I’m an award-winning playwright with a penchant for wordplay. After earning a perfect score on the Writing SAT, I worked my way through Brown University by moonlighting as a Kaplan Test Prep tutor. I received a BA with honors in Literary Arts (Playwriting)—which gave me the opportunity to study under Pulitzer Prize-winner Paula Vogel. In my previous roles as new media producer with Rosetta Stone, director of marketing for global ventures with The Juilliard School, and vice president of digital strategy with Up & Coming Media, I helped develop the voice for international brands. From my home office in Maui, Hawaii, I currently work on freelance and ghostwriting projects.




Posted on Mar 13, 2024

How Many Sentences Are in a Paragraph?

In most forms of writing, paragraphs tend to be around four to eight sentences long. This general range will vary depending on the type of writing in question and the effect the writer is aiming to achieve.

In this guide, we’ll look at the length of paragraphs in various types of writing and see what determines whether they should be 20 sentences long, or stand alone as single sentences.


A paragraph should be as long as it needs to be

Reedsy editor Rebecca Heyman says a paragraph generally begins when a new idea is introduced. “A single sentence can stand on its own as a paragraph if its treatment of a specific theme or motif is complete. Conversely, denser, more complex topics may require a substantial number of sentences to adequately unpack meaning.” For example, in this very paragraph that you’re reading right now, we’re dealing with a fairly abstract concept which requires multiple sentences for clarification. In other words, a paragraph should be as long as it needs to be in order to convey its point. 


Let’s now examine this idea from different perspectives, and look at how writers use paragraph breaks for different purposes.

In nonfiction, paragraphs tend to be longer

In nonfiction, where the purpose of writing is often to explain new concepts and ideas, paragraphs tend to be a bit on the longer side. They will often introduce an idea, explore it, and then draw conclusions based on that exploration.

In this paragraph from The Revolutionary Genius of Plants, Stefano Mancuso introduces the idea of plant intelligence, explores it, and draws a conclusion about it:

Even though they have nothing akin to a central brain, plants exhibit unmistakable attributes of intelligence. They are able to perceive their surroundings with a greater sensitivity than animals do. They actively compete for the limited resources in the soil and atmosphere; they evaluate their circumstances with precision; they perform sophisticated cost-benefit analyses; and, finally, they define and then take appropriate adaptive actions in response to environmental stimuli. Plants embody a model that is much more durable and innovative than that of animals; they are the living representation of how stability and flexibility can be combined. Their modular, diffused construction is the epitome of modernity: a cooperative, shared structure without any command centers, able to flawlessly resist repeated catastrophic events without losing functionality and adapt very quickly to huge environmental changes.

The idea is established simply in the first sentence: “plants exhibit unmistakable attributes of intelligence”. Mancuso then elaborates on this idea by discussing their perceptual and analytical properties, before concluding that plants' intelligence is an example of innovative and durable adaptability to the environment.

You’ll see a similar pattern across nonfiction and other types of expository writing, whether you’re reading books on military history, self-help guides, or gardening manuals. Where the intention of the work is to inform or educate the reader, this tried-and-true way of structuring paragraphs allows information to be passed on in manageable chunks.

However, when you’re writing with the intention of telling a story in an enjoyable fashion, paragraph breaks tend to happen more often, and for different reasons.


In fiction, they can be as short as a sentence

With artistic works of writing, where the focus is on storytelling and on providing the reader with a satisfying narrative experience, you will often see the greatest range of paragraph length within a single work. A novelist might have three pages of unbroken narrative, punctuated by a one-word paragraph.

In general, fiction writers will start a new paragraph whenever something new happens. For example:

Whenever dialogue or action switches between characters

In this extract from Gillian Flynn’s Gone Girl, the narrator, Nick, is being spoken to by his sister.

‘We were lost in the rain,’ she said in a voice that was pleading on the way to peeved.

I finished the shrug.

‘McMann’s, Nick. Remember, when we got lost in the rain in Chinatown[...]’

The first paragraph is a quick line of dialogue with a tag that indicates who is speaking. Nick then reacts with an action beat (his shrug) — which is in its own paragraph. Then there is another paragraph break to indicate that the next line is spoken once again by his sister.

In this context, paragraph breaks show the reader that we’re switching characters, allowing an author to avoid having to start every other sentence with “Margo said” or “I said”. 




When the narration changes between action and reflection

In a narrative, paragraph transitions can also be a way to indicate that the narrator is changing their perspective — often from describing the action of a scene, to remarking on a character’s thoughts or inner reactions.

In this passage from All Quiet on the Western Front, Erich Maria Remarque uses short paragraphs, sometimes single sentences, to paint an impressionistic vignette of a man’s death in the trenches.

But every gasp strips my heart bare. The dying man is the master of these hours, he has an invisible dagger to stab me with: the dagger of time and my own thoughts.

I would give a lot for him to live. It is hard to lie here and have to watch and listen to him. By three in the afternoon he is dead.

I breathe again. But only for a short time. Soon the silence seems harder for me to bear than the groans. I would even like to hear the gurgling again; in fits and starts, hoarse, sometimes a soft whistling noise and then hoarse and loud again.


Remarque uses paragraph transitions to depict the shift in focus of Paul (the main character), moving from the immediate sensory details of the scene to his internal reflections and emotional turmoil. Paul’s desire for the dying man to live, set against the harsh reality of death in wartime, highlights the tension between his internal empathy and the tragic experience of war.

Whenever there’s a time jump

Time jumps are often a good place to start a new paragraph to make it visually clear that some amount of time has passed. In this passage from Oliver Twist, Dickens starts a new paragraph to indicate time jumps.

They were sad rags, to tell the truth; and Oliver had never had a new suit before.

One evening, about a week after the affair of the picture, as he was sitting talking to Mrs. Bedwin, there came a message down from Mr. Brownlow, that if Oliver Twist felt pretty well, he should like to see him in his study, and talk to him a little while.

Without a paragraph change, it would feel odd that the narrator is suddenly taking us forward by a week right in the middle of telling us about Oliver’s clothes. Instead, the paragraph break indicates that one part of the story is over and that the next part is about to begin.




Paragraphs' length affects the pace of the writing

Paragraph length (along with sentence length) has a profound effect on the pace of one’s writing. A page or two of block paragraphs will take readers far longer to get through than several shorter paragraphs and often reflects whether something is occurring quickly or slowly within the narrative.

For example, in The Great Gatsby, Fitzgerald takes his time to describe a road that will play a significant role in the story.

‘About half way between West Egg and New York the motor-road hastily joins the railroad and runs beside it for a quarter of a mile, so as to shrink away from a certain desolate area of land. This is a valley of ashes—a fantastic farm where ashes grow like wheat into ridges and hills and grotesque gardens where ashes take the forms of houses and chimneys and rising smoke and finally, with a transcendent effort, of men who move dimly and already crumbling through the powdery air. Occasionally a line of gray cars crawls along an invisible track, gives out a ghastly creak and comes to rest, and immediately the ash-gray men swarm up with leaden spades and stir up an impenetrable cloud which screens their obscure operations from your sight.’

This longer paragraph is used to great effect in representing this desolate stretch of land between the party hubs of West Egg and New York City. Think of this long paragraph as the wide establishing shot in a movie: everything looks a bit slower from that perspective!

Equally, when Fitzgerald wants to pick up the pace, he uses shorter paragraphs, as seen in the following example:

‘What do you want money for, all of a sudden?’

‘I’ve been here too long. I want to get away. My wife and I want to go west.’

‘Your wife does!’ exclaimed Tom, startled.

‘She’s been talking about it for ten years.’ He rested for a moment against the pump, shading his eyes. ‘And now she’s going whether she wants to or not. I’m going to get her away.’

The coupé flashed by us with a flurry of dust and the flash of a waving hand.

‘What do I owe you?’ demanded Tom harshly.

‘I just got wised up to something funny the last two days,’ remarked Wilson. ‘That’s why I want to get away. That’s why I been bothering you about the car.’

‘What do I owe you?’

‘Dollar twenty.’


During this part of the novel, Tom Buchanan (the antagonist) is feeling cornered as both his wife and mistress are slipping away from him. The short paragraphs in quick succession show his irritability, giving readers an insight into how unsettled he is. Even the coupé driven by Gatsby flashes by him, the speed of which is also heightened by use of short paragraphs.

Shorter paragraphs introduce more white space on the page, which accelerates the pace of reading, while longer paragraphs slow it down. This can be used powerfully, as in the examples above: short paragraphs match the narrative when fast, uncontrollable, or sudden events happen, and long ones give readers a wide, slow, or sluggish feeling.

Many paragraphs later, we hope you found this helpful. If you’ve struggled with how to determine your paragraph lengths, know that it’s something that many writers find challenging! By roughly following the advice in this post you’ll be able to powerfully use varying paragraph lengths to hook readers throughout your work.


Writing academically: Paragraph structure


“An appropriate use of paragraphs is an essential part of writing coherent and well-structured essays.” Don Shiach, How to write essays

PEEL acronym - Point, evidence, explanation, link

  • A topic sentence – what is the overall point that the paragraph is making?
  • Evidence that supports your point – this is usually your cited material.
  • Explanation of why the point is important and how it helps with your overall argument.
  • A link (if necessary) to the next paragraph (or to the previous one if coming at the beginning of the paragraph) or back to the essay question.

This is a good order to use when you are new to writing academic essays - but as you get more accomplished you can adapt it as necessary. The important thing is to make sure all of these elements are present within the paragraph.

The sections below explain more about each of these elements.

The topic sentence (Point)

This should appear early in the paragraph and is often, but not always, the first sentence.  It should clearly state the main point that you are making in the paragraph. When you are planning essays, writing down a list of your topic sentences is an excellent way to check that your argument flows well from one point to the next.

Evidence

This is the evidence that backs up your topic sentence. Why do you believe what you have written in your topic sentence? The evidence is usually paraphrased or quoted material from your reading. Depending on the nature of the assignment, it could also include:

  • Your own data (in a research project for example).
  • Personal experiences from practice (especially for Social Care, Health Sciences and Education).
  • Personal experiences from learning (in a reflective essay for example).

Any evidence from external sources should, of course, be referenced.

Explanation (analysis)

This is the part of your paragraph where you explain to your reader why the evidence supports the point and why that point is relevant to your overall argument. It is where you answer the question 'So what?'. Tell the reader how the information in the paragraph helps you answer the question and how it leads to your conclusion. Your analysis should attempt to persuade the reader that your conclusion is the correct one.

These are the parts of your paragraphs that will get you the higher marks in any marking scheme.

Link

Links are optional but it will help your argument flow if you include them. They are sentences that help the reader understand how the parts of your argument are connected. Most commonly they come at the end of the paragraph but they can be equally effective at the beginning of the next one. Sometimes a link is split between the end of one paragraph and the beginning of the next (see the example paragraph below).

Length of a paragraph

Academic paragraphs are usually between 200 and 300 words long (they vary more than this but it is a useful guide). The important thing is that they should be long enough to contain all the above material. Only move on to a new paragraph if you are making a new point.

Many students make their paragraphs too short (because they are not including enough or any analysis) or too long (they are made up of several different points).
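
If you draft on a computer, a short script can flag paragraphs that fall outside the 200 to 300 word guide so that you know which ones to re-read. The following is a minimal sketch, not a definitive tool: the function name `check_paragraph_lengths` and the labels it returns are invented for illustration, and it assumes paragraphs are separated by a blank line.

```python
def check_paragraph_lengths(essay, low=200, high=300):
    """Return (paragraph_number, word_count, verdict) for each paragraph."""
    # Assumption: paragraphs are separated by one blank line.
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    report = []
    for number, paragraph in enumerate(paragraphs, start=1):
        words = len(paragraph.split())
        if words < low:
            verdict = "short"  # check whether the analysis is missing
        elif words > high:
            verdict = "long"   # check whether it makes more than one point
        else:
            verdict = "ok"
        report.append((number, words, verdict))
    return report


essay = "An opening point with its evidence.\n\nA second point."
for number, words, verdict in check_paragraph_lengths(essay):
    print(f"Paragraph {number}: {words} words ({verdict})")
```

A "short" or "long" verdict is only a prompt to look again. As noted above, paragraph lengths vary, and what matters is that each paragraph contains the point, evidence and explanation for a single idea.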

Example of an academic paragraph

Using storytelling in educational settings can enable educators to connect with their students because of inborn tendencies for humans to listen to stories. Written languages have only existed for between 6,000 and 7,000 years (Daniels & Bright, 1995); before then, and continually ever since in many cultures, important lessons for life were passed on using the oral tradition of storytelling. These varied from simple informative tales, to help us learn how to find food or avoid danger, to more magical and miraculous stories designed to help us see how we can resolve conflict and find our place in society (Zipes, 2012). Oral storytelling traditions are still fundamental to Native American culture, and Rebecca Bishop, a Native American public relations officer (quoted in Sorensen, 2012), believes that the physical act of storytelling is a special thing; children will automatically stop what they are doing and listen when a story is told. Professional communicators report that this continues to adulthood (Simmons, 2006; Stevenson, 2008). This means that storytelling can be a powerful tool for connecting with students of all ages in a way that a list of bullet points in a PowerPoint presentation cannot. The emotional connection and innate, almost hardwired, need to listen when someone tells a story means that educators can teach memorable lessons in a uniquely engaging manner that is common to all cultures.

This cross-cultural element of storytelling can be seen when reading or listening to wisdom tales from around the world...

Key: Topic sentence | Evidence (includes some analysis) | Analysis | Link (crosses into next paragraph)

  • Last Updated: Nov 10, 2023 4:11 PM

How Many Paragraphs Should an Essay Have?

  • 19th May 2023

You have an essay to write. You’ve researched the topic and crafted a strong thesis statement. Now it’s time to open the laptop and start tapping away on the keyboard. You know the required word count, but you’re unsure of one thing: How many paragraphs should you have in the essay? Gee, it would’ve been nice if your professor had specified that, huh?

No worries, friend, because in this post, we’ll provide a guide to how many paragraphs an essay should have. Generally, the number of paragraphs will depend on how many words and how many supporting details you need (more on that later). We’ll also explore the concept of paragraphs if you’re wondering what they’re all about. And remember, paragraphs serve a purpose. You can’t submit an essay without using them!

What Is a Paragraph?

You likely know what a paragraph is, but can you define it properly in plain English? Don’t feel bad if that question made you shake your head. Off the top of our heads, many of us can’t explain what a paragraph is.

A paragraph comprises at least five sentences about a particular topic. A paragraph must begin with a well-crafted topic sentence, which is then followed by ideas that support that sentence. To move the essay forward, the paragraph should flow well, and the sentences should be relevant.

Why Are Paragraphs Important?

Paragraphs expand on points you make about a topic, painting a vivid picture for the reader. Paragraphs break down information into chunks, which are easier to read than one giant, uninterrupted body of text. If your essay doesn’t use paragraphs, it likely won’t earn a good grade!

How Many Paragraphs Are in an Essay?

As mentioned, the number of paragraphs will depend on the word count and the quantity of supporting ideas required. However, if you have to write at least 1,000 words, you should aim for at least five paragraphs. Every essay should have an introduction and a conclusion. The reader needs to get a basic introduction to the topic and understand your thesis statement. They must also see key takeaway points at the end of the essay.

As a rule, a five-paragraph essay would look like this:

  • Introduction (with thesis statement)
  • Main idea 1 (with supporting details)
  • Main idea 2 (with supporting details)
  • Main idea 3 (with supporting details)
  • Conclusion

Your supporting details should include material (such as quotations or facts) from credible sources when writing the main idea paragraphs.

If you think your essay could benefit from having more than five paragraphs, add them! Just make sure they’re relevant to the topic.

Professors don’t care so much about the number of paragraphs; they want you to satisfy the minimum word requirement. Assignment rubrics rarely state the number of required paragraphs. It will be up to you to decide how many to write, and we urge you to research the assigned topic before writing the essay. Your main ideas from the research will generate most of the paragraphs.

When Should I Start a New Paragraph?

Surprisingly, some students aren’t aware that they should break up some of the paragraphs in their essays. You need to start new paragraphs to keep your reader engaged.

As well as starting a new paragraph after the introduction and another for the conclusion, you should do so when you’re introducing a new idea or presenting contrasting information.

Starting a paragraph often involves using transitional words or phrases to signal to the reader that you’re presenting a new idea. Failing to use these cues may cause confusion for the reader and undermine your essay’s coherence.

Let’s consider examples of transitional words and phrases in action in a conclusion. Note that the essay is about too much mobile device screen time and that transitional words and phrases can occur later in a paragraph too:

Thanks to “In conclusion” and “Additionally,” the reader clearly knows that they are now in the conclusion stage. They can also follow the logic and development of the essay more easily.

How Do I Know Whether I Have Enough Paragraphs?

While no magic number exists for how many paragraphs you need, you should know when you have enough to satisfy the requirements of the assignment. It helps if you can answer yes to the following questions:

  • Does my essay have both an introduction and a conclusion?
  • Have I provided enough main ideas with supporting details, including quotes and cited information?
  • Does my essay develop the thesis statement?
  • Does my essay adequately inform the reader about the topic?
  • Have I provided at least one takeaway for the reader?


Professors aren’t necessarily looking for a specific number of paragraphs in an essay; it’s the word count that matters. You should see the word count as a guide for a suitable number of paragraphs. As a rule, five paragraphs should suffice for a 1,000-word essay. As long as you have an introduction and a conclusion and provide enough supporting details for the main ideas in your body paragraphs, you should be good to go.

Remember to start a new paragraph when introducing new ideas or presenting contrasting information. Your reader needs to be able to follow the essay throughout, and a single, unbroken block of text would be difficult to read. Transitional words and phrases help start new paragraphs, so don’t forget to use them!

As with any writing, we always recommend proofreading your essay after you’ve finished it. This step will help to detect typos, extra spacing, and grammatical errors. A second pair of eyes is always useful, so we recommend asking our proofreading experts to review your essay. They’ll correct your grammar, ensure perfect spelling, and offer suggestions to improve your essay. You can even submit a 500-word document for free!

1. What is a paragraph and what is its purpose?

A paragraph is a group of sentences that expand on a single idea. The purpose of a paragraph is to introduce an idea and then develop it with supporting details.

2. What are the benefits of paragraphs?

Paragraphs make your essay easy to read by providing structure and flow. They let you transition from one idea to another. New paragraphs allow you to tell your reader that you’ve covered one point and are moving on to the next.

3. How many paragraphs does a typical essay have?

An essay of at least 1,000 words usually has five paragraphs. It’s best to use the required word count as a guide to the number of paragraphs you’ll need.

Purdue Online Writing Lab Purdue OWL® College of Liberal Arts

On Paragraphs

This page is brought to you by the OWL at Purdue University. When printing this page, you must include the entire legal notice.

Copyright ©1995-2018 by The Writing Lab & The OWL at Purdue and Purdue University. All rights reserved. This material may not be published, reproduced, broadcast, rewritten, or redistributed without permission. Use of this site constitutes acceptance of our terms and conditions of fair use.

What is a paragraph?

A paragraph is a collection of related sentences dealing with a single topic. Learning to write good paragraphs will help you as a writer stay on track during your drafting and revision stages. Good paragraphing also greatly assists your readers in following a piece of writing. You can have fantastic ideas, but if those ideas aren't presented in an organized fashion, you will lose your readers (and fail to achieve your goals in writing).

The Basic Rule: Keep one idea to one paragraph

The basic rule of thumb with paragraphing is to keep one idea to one paragraph. If you begin to transition into a new idea, it belongs in a new paragraph. There are some simple ways to tell if you are on the same topic or a new one. You can have one idea and several bits of supporting evidence within a single paragraph. You can also have several points in a single paragraph as long as they relate to the overall topic of the paragraph. If the single points start to get long, then perhaps elaborating on each of them and placing them in their own paragraphs is the route to go.

Elements of a paragraph

To be as effective as possible, a paragraph should contain each of the following: Unity, Coherence, A Topic Sentence, and Adequate Development. As you will see, all of these traits overlap. Using and adapting them to your individual purposes will help you construct effective paragraphs.

Unity

The entire paragraph should concern itself with a single focus. If it begins with one focus or major point of discussion, it should not end with another or wander within different ideas.

Coherence

Coherence is the trait that makes the paragraph easily understandable to a reader. You can help create coherence in your paragraphs by creating logical bridges and verbal bridges.

Logical bridges

  • The same idea of a topic is carried over from sentence to sentence
  • Successive sentences can be constructed in parallel form

Verbal bridges

  • Key words can be repeated in several sentences
  • Synonymous words can be repeated in several sentences
  • Pronouns can refer to nouns in previous sentences
  • Transition words can be used to link ideas from different sentences

A topic sentence

A topic sentence is a sentence that indicates in a general way what idea or thesis the paragraph is going to deal with. Although not all paragraphs have clear-cut topic sentences, and despite the fact that topic sentences can occur anywhere in the paragraph (as the first sentence, the last sentence, or somewhere in the middle), an easy way to make sure your reader understands the topic of the paragraph is to put your topic sentence near the beginning of the paragraph. (This is a good general rule for less experienced writers, although it is not the only way to do it). Regardless of whether you include an explicit topic sentence or not, you should be able to easily summarize what the paragraph is about.

Adequate development

The topic (which is introduced by the topic sentence) should be discussed fully and adequately. Again, this varies from paragraph to paragraph, depending on the author's purpose, but writers should be wary of paragraphs that only have two or three sentences. It's a pretty good bet that the paragraph is not fully developed if it is that short.

Some methods to make sure your paragraph is well-developed:

  • Use examples and illustrations
  • Cite data (facts, statistics, evidence, details, and others)
  • Examine testimony (what other people say such as quotes and paraphrases)
  • Use an anecdote or story
  • Define terms in the paragraph
  • Compare and contrast
  • Evaluate causes and reasons
  • Examine effects and consequences
  • Analyze the topic
  • Describe the topic
  • Offer a chronology of an event (time segments)

How do I know when to start a new paragraph?

You should start a new paragraph in the following situations:

  • When you begin a new idea or point. New ideas should always start in new paragraphs. If you have an extended idea that spans multiple paragraphs, each new point within that idea should have its own paragraph.
  • To contrast information or ideas. Separate paragraphs can serve to contrast sides in a debate, different points in an argument, or any other difference.
  • When your readers need a pause. Breaks between paragraphs function as a short "break" for your readers—adding these in will help your writing be more readable. You would create a break if the paragraph becomes too long or the material is complex.
  • When you are ending your introduction or starting your conclusion. Your introductory and concluding material should always be in a new paragraph. Many introductions and conclusions have multiple paragraphs depending on their content, length, and the writer's purpose.

Transitions and signposts

Two very important elements of paragraphing are signposts and transitions. Signposts are internal aids to assist readers; they usually consist of several sentences or a paragraph outlining what the article has covered and where the article will be going.

Transitions are usually one or several sentences that "transition" from one idea to the next. Transitions can be used at the end of most paragraphs to help the paragraphs flow one into the next.


How to Write an Introduction Paragraph in 3 Steps


It’s the roadmap to your essay, it’s the forecast for your argument, it’s...your introduction paragraph, and writing one can feel pretty intimidating. The introduction paragraph is a part of just about every kind of academic writing, from persuasive essays to research papers. But that doesn’t mean writing one is easy!

If trying to write an intro paragraph makes you feel like a Muggle trying to do magic, trust us: you aren’t alone. But there are some tips and tricks that can make the process easier—and that’s where we come in.

In this article, we’re going to explain how to write a captivating intro paragraph by covering the following info:  

  • A discussion of what an introduction paragraph is and its purpose in an essay
  • An overview of the most effective introduction paragraph format, with explanations of the three main parts of an intro paragraph
  • An analysis of real intro paragraph examples, with a discussion of what works and what doesn’t
  • A list of four top tips on how to write an introduction paragraph

Are you ready? Let’s begin!


What Is an Introduction Paragraph? 

An introduction paragraph is the first paragraph of an essay, paper, or other type of academic writing. Argumentative essays, book reports, research papers, and even personal essays are common types of writing that require an introduction paragraph. Whether you’re writing a research paper for a science course or an argumentative essay for English class, you’re going to have to write an intro paragraph.

So what’s the purpose of an intro paragraph? As a reader’s first impression of your essay, the intro paragraph should introduce the topic of your paper. 

Your introduction will also state any claims, questions, or issues that your paper will focus on. This is commonly known as your paper’s thesis. The thesis condenses the overall point of your paper into one or two short sentences that your reader can come back and reference later.

But intro paragraphs need to do a bit more than just introduce your topic. An intro paragraph is also supposed to grab your reader’s attention. The intro paragraph is your chance to provide just enough info and intrigue to make your reader say, “Hey, this topic sounds interesting. I think I’ll keep reading this essay!” That can help your essay stand out from the crowd.

In most cases, an intro paragraph will be relatively short. A good intro will be clear, brief, purposeful, and focused. While there are some exceptions to this rule, it’s common for intro paragraphs to consist of three to five sentences.

Effectively introducing your essay’s topic and purpose and getting your reader invested in your essay sounds like a lot to ask from one little paragraph, huh? In the next section, we’ll demystify the intro paragraph format by breaking it down into its core parts. When you learn how to approach each part of an intro, writing one won’t seem so scary!


The 3 Main Parts of an Intro Paragraph

In general, an intro paragraph is going to have three main parts: a hook, context, and a thesis statement. Each of these pieces of the intro plays a key role in acquainting the reader with the topic and purpose of your essay.

Below, we’ll explain how to start an introduction paragraph by writing an effective hook, providing context, and crafting a thesis statement. When you put these elements together, you’ll have an intro paragraph that makes a great first impression on your audience!

Intro Paragraph Part 1: The Hook

When it comes to how to start an introduction paragraph, one of the most common approaches is to start with something called a hook.

What does hook mean here, though? Think of it this way: when you start a new Netflix series, you look up a few hours (and a few episodes) later and say, “Whoa. I guess I must be hooked on this show!”

That’s how the hook is supposed to work in an intro paragraph: it should get your reader interested enough that they don’t want to press the proverbial “pause” button while they’re reading it. In other words, a hook is designed to grab your reader’s attention and keep them reading your essay!

This means that the hook comes first in the intro paragraph format—it’ll be the opening sentence of your intro. 

It’s important to realize that there are many different ways to write a good hook. But generally speaking, hooks must include these two things: what your topic is, and the angle you’re taking on that topic in your essay.

One approach to writing a hook that works is starting with a general, but interesting, statement on your topic. In this type of hook, you’re trying to provide a broad introduction to your topic and your angle on the topic in an engaging way.

For example, if you’re writing an essay about the role of the government in the American healthcare system, your hook might look something like this: 

There's a growing movement to require that the federal government provide affordable, effective healthcare for all Americans. 

This hook introduces the essay topic in a broad way (government and healthcare) by presenting a general statement on the topic. But the assumption presented in the hook can also be seen as controversial, which gets readers interested in learning more about what the writer—and the essay—has to say.

In other words, the statement above fulfills the goals of a good hook: it’s intriguing and provides a general introduction to the essay topic.

Intro Paragraph Part 2: Context

Once you’ve provided an attention-grabbing hook, you’ll want to give more context about your essay topic. Context refers to additional details that reveal the specific focus of your paper. So, whereas the hook provides a general introduction to your topic, context starts helping readers understand what exactly you’re going to be writing about.

You can include anywhere from one to several sentences of context in your intro, depending on your teacher’s expectations, the length of your paper, and complexity of your topic. In these context-providing sentences, you want to begin narrowing the focus of your intro. You can do this by describing a specific issue or question about your topic that you’ll address in your essay. It also helps readers start to understand why the topic you’re writing about matters and why they should read about it. 

So, what counts as context for an intro paragraph? Context can be any important details or descriptions that provide background on existing perspectives, common cultural attitudes, or a specific situation or controversy relating to your essay topic. The context you include should acquaint your reader with the issues, questions, or events that motivated you to write an essay on your topic...and that your reader should know in order to understand your thesis. 

For instance, if you’re writing an essay analyzing the consequences of sexism in Hollywood, the context you include after your hook might make reference to the #metoo and #timesup movements that have generated public support for victims of sexual harassment. 

The key takeaway here is that context establishes why you’re addressing your topic and what makes it important. It also sets you up for success on the final piece of an intro paragraph: the thesis statement.

Intro Paragraph Part 3: The Thesis

The final key part of how to write an intro paragraph is the thesis statement. The thesis statement is the backbone of your introduction: it conveys your argument or point of view on your topic in a clear, concise, and compelling way. The thesis is usually the last sentence of your intro paragraph.

Whether it’s making a claim, outlining key points, or stating a hypothesis, your thesis statement will tell your reader exactly what idea(s) are going to be addressed in your essay. A good thesis statement will be clear, straightforward, and highlight the overall point you’re trying to make.

Some instructors also ask students to include an essay map as part of their thesis. An essay map is a section that outlines the major topics a paper will address. So for instance, say you’re writing a paper that argues for the importance of public transport in rural communities. Your thesis and essay map might look like this: 

Having public transport in rural communities helps people improve their economic situation by giving them reliable transportation to their job, reducing the amount of money they spend on gas, and providing new and unionized work.

The section beginning with “by giving them reliable transportation” is the essay map because it touches on the three big things the writer will talk about later. It literally maps out the rest of the essay!

So let’s review: Your thesis takes the idea you’ve introduced in your hook and context and wraps it up. Think of it like a television episode: the hook sets the scene by presenting a general statement and/or interesting idea that sucks you in. The context advances the plot by describing the topic in more detail and helping readers understand why the topic is important. And finally, the thesis statement provides the climax by telling the reader what you have to say about the topic. 

The thesis statement is the most important part of the intro. Without it, your reader won’t know what the purpose of your essay is! And for a piece of writing to be effective, it needs to have a clear purpose. Your thesis statement conveys that purpose, so it’s important to put careful thought into writing a clear and compelling thesis statement.


How To Write an Introduction Paragraph: Example and Analysis

Now that we’ve provided an intro paragraph outline and have explained the three key parts of an intro paragraph, let’s take a look at an intro paragraph in action.

To show you how an intro paragraph works, we’ve included a sample introduction paragraph below, followed by an analysis of its strengths and weaknesses.

Example of Introduction Paragraph

While college students in the U.S. are struggling with how to pay for college, there is another surprising demographic that’s affected by the pressure to pay for college: families and parents. In the face of tuition price tags that total more than $100,000 (as a low estimate), families must make difficult decisions about how to save for their children’s college education. Charting a feasible path to saving for college is further complicated by the FAFSA’s estimates for an “Expected Family Contribution”—an amount of money that is rarely feasible for most American families. Due to these challenging financial circumstances and cultural pressure to give one’s children the best possible chance of success in adulthood, many families are going into serious debt to pay for their children’s college education. The U.S. government should move toward bearing more of the financial burden of college education. 

Example of Introduction Paragraph: Analysis

Before we dive into analyzing the strengths and weaknesses of this example intro paragraph, let’s establish the essay topic. The sample intro indicates that the essay topic will focus on one specific issue: who should cover the cost of college education in the U.S., and why. Both the hook and the context help us identify the topic, while the thesis in the last sentence tells us why this topic matters to the writer: they think the U.S. government needs to help finance college education. This is also the writer’s argument, which they’ll cover in the body of their essay.

Now that we’ve identified the essay topic presented in the sample intro, let’s dig into some analysis. To pin down its strengths and weaknesses, we’re going to use the following three questions to guide our analysis of the example introduction paragraph:

  • Does this intro provide an attention-grabbing opening sentence that conveys the essay topic? 
  • Does this intro provide relevant, engaging context about the essay topic? 
  • Does this intro provide a thesis statement that establishes the writer’s point of view on the topic and what specific aspects of the issue the essay will address? 

Now, let’s use the questions above to analyze the strengths and weaknesses of this sample intro paragraph. 

Does the Intro Have a Good Hook? 

First, the intro starts out with an attention-grabbing hook. The writer starts by presenting an assumption (that college students are the ones struggling with how to pay for college), which makes the topic relatable to a wide audience of readers. Also note that the hook relates to the general topic of the essay, which is the high cost of college education.

The hook then takes a surprising turn by presenting a counterclaim: that American families, rather than students, feel the true burden of paying for college. Some readers will have a strong emotional reaction to this provocative counterclaim, which will make them want to keep reading! As such, this intro provides an effective opening sentence that conveys the essay topic.

Does the Intro Give Context?

The second, third, and fourth sentences of the intro provide contextual details that reveal the specific focus of the writer’s paper. Remember: the context helps readers start to zoom in on what the paper will focus on, and what aspect of the general topic (college costs) will be discussed later on.

The context in this intro reveals the intent and direction of the paper by explaining why the issue of families financing college is important. In other words, the context helps readers understand why this issue matters, and what aspects of this issue will be addressed in the paper.

To provide effective context, the writer refers to issues (the exorbitant cost of college and high levels of family debt) that have received a lot of recent scholarly and media attention. These sentences of context also elaborate on the interesting perspective included in the hook: that American families are most affected by college costs.

Does the Intro Have a Thesis? 

Finally, this intro provides a thesis statement that conveys the writer’s point of view on the issue of financing college education. This writer believes that the U.S. government should do more to pay for students’ college educations. 

However, the thesis statement doesn’t give us any details about why the writer has made this claim or why this will help American families. There isn’t an essay map that helps readers understand what points the writer will make in the essay.

To revise this thesis statement so that it establishes the specific aspects of the topic that the essay will address, the writer could extend it with a “because” clause, as follows:

The U.S. government should take on more of the financial burden of college education because other countries have shown this can improve education rates while reducing levels of familial poverty.

Check out the new section that begins with “because.” Not only does it clarify that the writer is talking about the pressure put on families, it also touches on the big topics the writer will address in the paper: improving education rates and reducing poverty. So not only do we have a clearer argumentative statement in this thesis, we also have an essay map!

So, let’s recap our analysis. This sample intro paragraph does an effective job of providing an engaging hook and relatable, interesting context, but the thesis statement needs some work! As you write your own intro paragraphs, you might consider using the questions above to evaluate and revise your work. Doing this will help ensure you’ve covered all of your bases and written an intro that your readers will find interesting!


4 Tips for How To Write an Introduction Paragraph

Now that we’ve analyzed an example introduction paragraph, let’s talk about how to write an introduction paragraph of your own. Keep reading for four tips for writing a successful intro paragraph for any essay.

Tip 1: Analyze Your Essay Prompt

If you’re having trouble with how to start an introduction paragraph, analyze your essay prompt! Most teachers give you some kind of assignment sheet, formal instructions, or prompt to set the expectations for an essay they’ve assigned, right? Those instructions can help guide you as you write your intro paragraph!

Because they’ll be reading and responding to your essay, you want to make sure you meet your teacher’s expectations for an intro paragraph. For instance, if they’ve provided specific instructions about how long the intro should be or where the thesis statement should be located, be sure to follow them!

The type of paper you’re writing can give you clues as to how to approach your intro as well. If you’re writing a research paper, your professor might expect you to provide a research question or state a hypothesis in your intro. If you’re writing an argumentative essay, you’ll need to make sure your intro overviews the context surrounding your argument and your thesis statement includes a clear, defensible claim. 

Using the parameters set out by your instructor and assignment sheet can put some easy-to-follow boundaries in place for things like your intro’s length, structure, and content. Following these guidelines can free you up to focus on other aspects of your intro... like coming up with an exciting hook and conveying your point of view on your topic!

Tip 2: Narrow Your Topic

You can’t write an intro paragraph without first identifying your topic. To make your intro as effective as possible, you need to define the parameters of your topic clearly—and you need to be specific. 

For example, let’s say you want to write about college football. “NCAA football” is too broad of a topic for a paper. There is a lot to talk about in terms of college football! It would be tough to write an intro paragraph that’s focused, purposeful, and engaging on this topic. In fact, if you did try to address this whole topic, you’d probably end up writing a book!

Instead, for your intro to be effective, you should narrow broad topics to identify a specific question, claim, or issue pertaining to some aspect of NCAA football. So, for instance, you could frame your topic as, “How can college professors better support NCAA football players in academics?” This focused topic pertaining to NCAA football would give you a more manageable angle to discuss in your paper.

So before you think about writing your intro, ask yourself: Is my essay topic specific, focused, and logical? Does it convey an issue or question that I can explore over the course of several pages? Once you’ve established a good topic, you’ll have the foundation you need to write an effective intro paragraph.


Tip 3: Do Your Research

This tip is tightly intertwined with the one above, and it’s crucial to writing a good intro: do your research! And, guess what? This tip applies to all papers—even ones that aren’t technically research papers. 

Here’s why you need to do some research: getting the lay of the land on what others have said about your topic, whether that’s scholars and researchers or the mass media, will help you narrow your topic, write an engaging hook, and provide relatable context.

You don't want to sit down to write your intro without a solid understanding of the different perspectives on your topic. Whether those are the perspectives of experts or the general public, these points of view will help you write your intro in a way that is intriguing and compelling for your audience of readers. 

Tip 4: Write Multiple Drafts

Some say to write your intro first; others say write it last. The truth is, there isn’t a right or wrong time to write your intro, but you do need to have enough time to write multiple drafts.

Oftentimes, your professor will ask you to write multiple drafts of your paper, which gives you a built-in way to make sure you revise your intro. Another approach you could take is to write out a rough draft of your intro before you begin writing your essay, then revise it multiple times as you draft out your paper. 

Here’s why this approach can work: as you write your paper, you’ll probably come up with new insights on your topic that you didn’t have right from the start. You can use these “light bulb” moments to reevaluate your intro and make revisions that keep it in line with your developing essay draft. 

Once you’ve written your entire essay, consider going back and revising your intro again. You can ask yourself these questions as you evaluate your intro:

  • Is my hook still relevant to the way I’ve approached the topic in my essay?
  • Do I provide enough appropriate context to introduce my essay? 
  • Now that my essay is written, does my thesis statement still accurately reflect the point of view that I present in my essay?

Using these questions as a guide and putting your intro through multiple revisions will help ensure that you’ve written the best intro for the final draft of your essay. Also, revising your writing is always a good thing to do—and this applies to your intro, too!


What's Next?

Your college essays also need great intro paragraphs. Here’s a guide that focuses on how to write the perfect intro for your admissions essays. 

Of course, the intro is just one part of your college essay. This article will teach you how to write a college essay that makes admissions counselors sit up and take notice.

Are you trying to write an analytical essay? Our step-by-step guide can help you knock it out of the park.


Ashley Sufflé Robinson has a Ph.D. in 19th Century English Literature. As a content writer for PrepScholar, Ashley is passionate about giving college-bound students the in-depth information they need to get into the school of their dreams.

  • Open access
  • Published: 03 June 2024

Applying large language models for automated essay scoring for non-native Japanese

  • Wenchao Li 1 &
  • Haitao Liu 2  

Humanities and Social Sciences Communications volume 11, Article number: 723 (2024)


  • Language and linguistics

Recent advancements in artificial intelligence (AI) have led to an increased use of large language models (LLMs) for language assessment tasks such as automated essay scoring (AES), automated listening tests, and automated oral proficiency assessments. The application of LLMs for AES in the context of non-native Japanese, however, remains limited. This study explores the potential of LLM-based AES by comparing the efficiency of different models, i.e. two conventional machine learning-based methods (Jess and JWriter), two LLMs (GPT and BERT), and one Japanese local LLM (Open-Calm large model). To conduct the evaluation, a dataset consisting of 1400 story-writing scripts authored by learners with 12 different first languages was used. Statistical analysis revealed that GPT-4 outperforms Jess and JWriter, BERT, and the Japanese language-specific trained Open-Calm large model in terms of annotation accuracy and predicting learning levels. Furthermore, by comparing 18 different models that utilize various prompts, the study emphasized the significance of prompts in achieving accurate and reliable evaluations using LLMs.


Conventional machine learning technology in AES

AES has experienced significant growth with the advancement of machine learning technologies in recent decades. In the earlier stages of AES development, conventional machine learning-based approaches were commonly used. These approaches involved two procedures: (a) A dataset of essays is fed to the machine learning system. The dataset serves as the basis for training the model and establishing patterns and correlations between linguistic features and human ratings. (b) The machine learning model is trained using linguistic features that best represent human ratings and can effectively discriminate learners’ writing proficiency. These features include lexical richness (Lu, 2012; Kyle and Crossley, 2015; Kyle et al. 2021), syntactic complexity (Lu, 2010; Liu, 2008), and text cohesion (Crossley and McNamara, 2016), among others. Conventional machine learning approaches in AES require human intervention, such as manual correction and annotation of essays, to create a labeled dataset for training the model. Several AES systems have been developed using conventional machine learning technologies. These include the Intelligent Essay Assessor (Landauer et al. 2003), the e-rater engine by Educational Testing Service (Attali and Burstein, 2006; Burstein, 2003), MyAccess with the IntelliMetric scoring engine by Vantage Learning (Elliot, 2003), and the Bayesian Essay Test Scoring system (Rudner and Liang, 2002). These systems have played a significant role in automating the essay scoring process and providing quick and consistent feedback to learners. However, as touched upon earlier, conventional machine learning approaches rely on predetermined linguistic features and often require manual intervention, making them less flexible and potentially limiting their generalizability to different contexts.

In the context of the Japanese language, conventional machine learning-incorporated AES tools include Jess (Ishioka and Kameda, 2006) and JWriter (Lee and Hasebe, 2017). Jess assesses essays by deducting points from the perfect score, utilizing the Mainichi Daily News newspaper as a database. The evaluation criteria employed by Jess encompass various aspects, such as rhetorical elements (e.g., reading comprehension, vocabulary diversity, percentage of complex words, and percentage of passive sentences), organizational structures (e.g., forward and reverse connection structures), and content analysis (e.g., latent semantic indexing). JWriter employs linear regression analysis to assign weights to various measurement indices, such as average sentence length and total number of characters. These weights are then combined to derive the overall score. A pilot study involving the Jess model was conducted on 1320 essays at different proficiency levels, including primary, intermediate, and advanced. However, the results indicated that the Jess model failed to significantly distinguish between these essay levels. Out of the 16 measures used, four measures, namely median sentence length, median clause length, median number of phrases, and maximum number of phrases, did not show statistically significant differences between the levels. Additionally, two measures exhibited between-level differences but lacked linear progression: the number of attributive declined words and the Kanji/kana ratio. On the other hand, the remaining measures, including maximum sentence length, maximum clause length, number of attributive conjugated words, maximum number of consecutive infinitive forms, maximum number of conjunctive-particle clauses, k characteristic value, percentage of big words, and percentage of passive sentences, demonstrated statistically significant between-level differences and displayed linear progression.

Both Jess and JWriter exhibit notable limitations, including the manual selection of feature parameters and weights, which can introduce biases into the scoring process. The reliance on human annotators to label non-native language essays also introduces potential noise and variability in the scoring. Furthermore, an important concern is the possibility of system manipulation and cheating by learners who are aware of the regression equation utilized by the models (Hirao et al. 2020). These limitations emphasize the need for further advancements in AES systems to address these challenges.

Deep learning technology in AES

Deep learning has emerged as one of the approaches for improving the accuracy and effectiveness of AES. Deep learning-based AES methods utilize artificial neural networks that mimic the human brain’s functioning through layered algorithms and computational units. Unlike conventional machine learning, deep learning autonomously learns from the environment and past errors without human intervention. This enables deep learning models to establish nonlinear correlations, resulting in higher accuracy. Recent advancements in deep learning have led to the development of transformers, which are particularly effective in learning text representations. Noteworthy examples include bidirectional encoder representations from transformers (BERT) (Devlin et al. 2019) and the generative pretrained transformer (GPT) (OpenAI).

BERT is a language representation model that utilizes a transformer architecture and is trained on two tasks: masked language modeling and next-sentence prediction (Hirao et al. 2020; Vaswani et al. 2017). In the context of AES, BERT follows specific procedures, as illustrated in Fig. 1: (a) the tokenized prompts and essays are taken as input; (b) special tokens, such as [CLS] and [SEP], are added to mark the beginning and separation of prompts and essays; (c) the transformer encoder processes the prompt and essay sequences, resulting in hidden layer sequences; (d) the hidden layers corresponding to the [CLS] tokens (T[CLS]) represent distributed representations of the prompts and essays; and (e) a multilayer perceptron uses these distributed representations as input to obtain the final score (Hirao et al. 2020).

Fig. 1: AES system with BERT (Hirao et al. 2020).
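Steps (a) and (b) of the BERT procedure can be sketched in plain Python as follows. This is a minimal illustration only, not the implementation used by Hirao et al.: the function name `build_bert_input`, the 512-token limit, the segment-id convention, and the choice to truncate the end of the essay are all assumptions introduced here.

```python
def build_bert_input(prompt_tokens, essay_tokens, max_length=512):
    """Assemble [CLS] prompt [SEP] essay [SEP] from already-tokenized text.

    max_length=512 is an assumed limit; truncating the tail of the essay is
    one possible way to handle essays that exceed it.
    """
    # Three positions are reserved for [CLS] and the two [SEP] tokens.
    budget = max_length - len(prompt_tokens) - 3
    if budget < 0:
        raise ValueError("prompt alone exceeds max_length")
    essay_tokens = essay_tokens[:budget]

    tokens = ["[CLS]", *prompt_tokens, "[SEP]", *essay_tokens, "[SEP]"]
    # Segment 0 covers [CLS], the prompt and the first [SEP];
    # segment 1 covers the essay and the final [SEP].
    segment_ids = [0] * (len(prompt_tokens) + 2) + [1] * (len(essay_tokens) + 1)
    return tokens, segment_ids


# Toy input with hypothetical tokens; a real system would use a subword tokenizer.
tokens, segment_ids = build_bert_input(["p1", "p2"], ["e1", "e2", "e3"], max_length=7)
print(tokens)
print(segment_ids)
```

In this toy call the essay loses its last token because the assumed limit of 7 leaves room for only two essay tokens, which mirrors the length limitation discussed for BERT-based AES.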

Training BERT on a substantial amount of sentence data through the Masked Language Model (MLM) allows it to capture contextual information within the hidden layers. Consequently, BERT is expected to be capable of identifying artificial essays as invalid and assigning them lower scores (Mizumoto and Eguchi, 2023). In the context of AES for nonnative Japanese learners, Hirao et al. (2020) combined the long short-term memory (LSTM) model proposed by Hochreiter and Schmidhuber (1997) with BERT to develop a tailored automated essay scoring system. The findings of their study revealed that the BERT model outperformed both the conventional machine learning approach utilizing character-type features such as “kanji” and “hiragana” and the standalone LSTM model. Takeuchi et al. (2021) presented an approach to Japanese AES that eliminates the requirement for pre-scored essays by relying solely on reference texts or a model answer for the essay task. They investigated multiple similarity evaluation methods, including frequency of morphemes, idf values calculated on Wikipedia, LSI, LDA, word-embedding vectors, and document vectors produced by BERT. The experimental findings revealed that the method utilizing the frequency of morphemes with idf values exhibited the strongest correlation with human-annotated scores across different essay tasks. The utilization of BERT in AES encounters several limitations. First, essays often exceed the model’s maximum length limit. Second, only score labels are available for training, which restricts access to additional information.

Mizumoto and Eguchi (2023) were pioneers in employing the GPT model for AES in non-native English writing. Their study focused on evaluating the accuracy and reliability of AES using the GPT-3 text-davinci-003 model, analyzing a dataset of 12,100 essays from the corpus of nonnative written English (TOEFL11). The findings indicated that AES utilizing the GPT-3 model exhibited a certain degree of accuracy and reliability. They suggest that GPT-3-based AES systems hold the potential to provide support for human ratings. However, applying the GPT model to AES presents a unique natural language processing (NLP) task that involves considerations such as nonnative language proficiency, the influence of the learner’s first language on the output in the target language, and identifying linguistic features that best indicate writing quality in a specific language. These linguistic features may differ morphologically or syntactically from those present in the learners’ first language, as observed in (1)–(3).


(1) Wǒ-sòngle-tā-yī běn-shū

1sg-give.past-him-one.cl-book

“I gave him a book.”

(2) 3sg-dat-hon-acc-give.honorification.past

(3) give, give-s, gave, given, giving

Additionally, the morphological agglutination and subject-object-verb (SOV) order in Japanese, along with its idiomatic expressions, pose additional challenges for applying language models in AES tasks (4).

(4) 足-が 棒-に なり-ました

Ashi-ga bō-ni nari-mashita

leg-nom stick-dat become-past

“My leg became like a stick (I am extremely tired).”

The example sentence in (4) demonstrates the morpho-syntactic structure of Japanese and the presence of an idiomatic expression. In this sentence, the verb “なる” (naru), meaning “to become”, appears at the end of the sentence. The verb stem “なり” (nari) is followed by morphemes indicating honorification (“ます”, masu) and tense (“た”, ta), showcasing agglutination. While the sentence can be literally translated as “my leg became like a stick”, it carries an idiomatic interpretation that implies “I am extremely tired”.

To overcome this issue, CyberAgent Inc. (2023) has developed the Open-Calm series of language models specifically designed for Japanese. Open-Calm consists of pre-trained models available in various sizes, such as Small, Medium, Large, and 7b. Figure 2 depicts the fundamental structure of the Open-Calm model. A key feature of this architecture is the incorporation of the LoRA adapter and GPT-NeoX frameworks, which can enhance its language processing capabilities.

Fig. 2: GPT-NeoX model architecture (Okgetheng and Takeuchi 2024).

In a recent study, Okgetheng and Takeuchi (2024) assessed the efficacy of Open-Calm language models in grading Japanese essays. The research utilized a dataset of approximately 300 essays, which were annotated by native Japanese educators. The findings of the study demonstrate the considerable potential of Open-Calm language models in automated Japanese essay scoring. Specifically, among the Open-Calm family, the Open-Calm Large model (referred to as OCLL) exhibited the highest performance. However, it is important to note that, as of the current date, the Open-Calm Large model does not offer public access to its server. Consequently, users are required to independently deploy and operate the environment for OCLL. In order to utilize OCLL, users must have a PC equipped with an NVIDIA GeForce RTX 3060 (8 or 12 GB VRAM).

In summary, while the potential of LLMs in automated scoring of nonnative Japanese essays has been demonstrated in two studies, BERT-driven AES (Hirao et al. 2020) and OCLL-based AES (Okgetheng and Takeuchi, 2024), the number of research efforts in this area remains limited.

Another significant challenge in applying LLMs to AES lies in prompt engineering and ensuring its reliability and effectiveness (Brown et al. 2020; Rae et al. 2021; Zhang et al. 2021). Various prompting strategies have been proposed, such as the zero-shot chain of thought (CoT) approach (Kojima et al. 2022), which involves manually crafting diverse and effective examples. However, manual efforts can lead to mistakes. To address this, Zhang et al. (2021) introduced an automatic CoT prompting method called Auto-CoT, which demonstrates matching or superior performance compared to the CoT paradigm. Another prompt framework is tree of thoughts, which enables a model to self-evaluate its progress at intermediate stages of problem-solving through deliberate reasoning (Yao et al. 2023).

Beyond linguistic studies, there has been a noticeable increase in the number of foreign workers in Japan and Japanese learners worldwide (Ministry of Health, Labor, and Welfare of Japan, 2022; Japan Foundation, 2021). However, existing assessment methods, such as the Japanese Language Proficiency Test (JLPT), J-CAT, and TTBJ (footnote 1), primarily focus on reading, listening, vocabulary, and grammar skills, neglecting the evaluation of writing proficiency. As the number of workers and language learners continues to grow, there is a rising demand for an efficient AES system that can reduce costs and time for raters and be utilized for employment, examinations, and self-study purposes.

This study aims to explore the potential of LLM-based AES by comparing the effectiveness of five models: two LLMs (GPT (footnote 2) and BERT), one Japanese local LLM (OCLL), and two conventional machine learning-based methods (the linguistic feature-based scoring tools Jess and JWriter).

The research questions addressed in this study are as follows:

To what extent do the LLM-driven AES and linguistic feature-based AES, when used as automated tools to support human rating, accurately reflect test takers’ actual performance?

What influence does the prompt have on the accuracy and performance of LLM-based AES methods?

The subsequent sections of the manuscript cover the methodology, including the assessment measures for nonnative Japanese writing proficiency, criteria for prompts, and the dataset. The evaluation section focuses on the analysis of annotations and rating scores generated by LLM-driven and linguistic feature-based AES methods.


The dataset utilized in this study was obtained from the International Corpus of Japanese as a Second Language (I-JAS) (footnote 3). This corpus comprises 1000 participants who represent 12 different first languages. For the study, the participants were given a story-writing task on a personal computer. They were required to write two stories based on the 4-panel illustrations titled “Picnic” and “The key” (see Appendix A). Background information for the participants was provided by the corpus, including their Japanese language proficiency levels assessed through two online tests: J-CAT and SPOT. These tests evaluated their reading, listening, vocabulary, and grammar abilities. The learners’ proficiency levels were categorized into six levels aligned with the Common European Framework of Reference for Languages (CEFR) and the Reference Framework for Japanese Language Education (RFJLE): A1, A2, B1, B2, C1, and C2. According to Lee et al. (2015), there is a high level of agreement (r = 0.86) between the J-CAT and SPOT assessments, indicating that the proficiency certifications provided by J-CAT are consistent with those of SPOT. However, it is important to note that the scores of J-CAT and SPOT do not have a one-to-one correspondence. In this study, the J-CAT scores were used as a benchmark to differentiate learners of different proficiency levels. A total of 1400 essays were utilized, representing the beginner (aligned with A1), A2, B1, B2, C1, and C2 levels based on the J-CAT scores. Table 1 provides information about the learners’ proficiency levels and their corresponding J-CAT and SPOT scores.

A dataset comprising a total of 1400 essays from the story writing tasks was collected. Among these, 714 essays were utilized to evaluate the reliability of the LLM-based AES method, while the remaining 686 essays were designated as development data to assess the LLM-based AES’s capability to distinguish participants with varying proficiency levels. The GPT-4 API was used in this study. A detailed explanation of the prompt-assessment criteria is provided in the Prompt section. All essays were sent to the model for measurement and scoring.

Measures of writing proficiency for nonnative Japanese

Japanese exhibits a morphologically agglutinative structure where morphemes are attached to the word stem to convey grammatical functions such as tense, aspect, voice, and honorifics, e.g. (5).



(5) [eat (stem)-causative-passive voice-honorification-tense.past-question marker]

Japanese employs nine case particles to indicate grammatical functions, including the nominative case particle が (ga), the accusative case particle を (o), the genitive case particle の (no), the dative case particle に (ni), the locative/instrumental case particle で (de), the ablative case particle から (kara), the directional case particle へ (e), and the comitative case particle と (to). The agglutinative nature of the language, combined with the case particle system, provides an efficient means of distinguishing between active and passive voice, either through morphemes or case particles, e.g. 食べる taberu “eat.conclusive” (active voice); 食べられる taberareru “eat.conclusive” (passive voice). In the active voice, “パン を 食べる” (pan o taberu) translates to “to eat bread”. In the passive voice, it becomes “パン が 食べられた” (pan ga taberareta), which means “(the) bread was eaten”. Additionally, different conjugations of the same lemma are counted as one type in order to ensure a comprehensive assessment of the language features. For example, 食べる taberu “eat.conclusive”, 食べている tabeteiru “eat.progressive”, and 食べた tabeta “eat.past” are counted as one type.

To incorporate these features, previous research (Suzuki, 1999; Watanabe et al. 1988; Ishioka, 2001; Ishioka and Kameda, 2006; Hirao et al. 2020) has identified complexity, fluency, and accuracy as crucial factors for evaluating writing quality. These criteria are assessed through various aspects, including lexical richness (lexical density, diversity, and sophistication), syntactic complexity, and cohesion (Kyle et al. 2021; Mizumoto and Eguchi, 2023; Ure, 1971; Halliday, 1985; Barkaoui and Hadidi, 2020; Zenker and Kyle, 2021; Kim et al. 2018; Lu, 2017; Ortega, 2015). Therefore, this study proposes five scoring categories: lexical richness, syntactic complexity, cohesion, content elaboration, and grammatical accuracy. A total of 16 measures were employed to capture these categories. The calculation process and specific details of these measures can be found in Table 2.

The T-unit, first introduced by Hunt (1966), is a measure used for evaluating speech and composition. It serves as an indicator of syntactic development and represents the shortest units into which a piece of discourse can be divided without leaving any sentence fragments. In the context of Japanese language assessment, Sakoda and Hosoi (2020) utilized the T-unit as the basic unit to assess the accuracy and complexity of Japanese learners’ speaking and storytelling. The calculation of T-units in Japanese follows these principles:

A single main clause constitutes 1 T-unit, regardless of the presence or absence of dependent clauses, e.g. (6).

(6) ケンとマリはピクニックに行きました (main clause): 1 T-unit.

If a sentence contains a main clause along with subclauses, each subclause is considered part of the same T-unit, e.g. (7).

(7) 天気が良かったので (subclause)、ケンとマリはピクニックに行きました (main clause): 1 T-unit.

In the case of coordinate clauses, where multiple clauses are connected, each coordinated clause is counted separately. Thus, a sentence with coordinate clauses may have 2 T-units or more, e.g. (8).

(8) ケンは地図で場所を探して (coordinate clause)、マリはサンドイッチを作りました (coordinate clause): 2 T-units.

Lexical diversity refers to the range of words used within a text (Engber, 1995; Kyle et al. 2021) and is considered a useful measure of the breadth of vocabulary in Ln production (Jarvis, 2013a, 2013b).

The type/token ratio (TTR) is widely recognized as a straightforward measure for calculating lexical diversity and has been employed in numerous studies. These studies have demonstrated a strong correlation between TTR and other methods of measuring lexical diversity (e.g., Bentz et al. 2016; Čech and Miroslav, 2018; Çöltekin and Taraka, 2018). TTR is computed by considering both the number of unique words (types) and the total number of words (tokens) in a given text. Given that the length of learners’ writing texts can vary, this study employs the moving average type-token ratio (MATTR) to mitigate the influence of text length. MATTR is calculated using a 50-word moving window. Initially, a TTR is determined for words 1–50 in an essay, followed by words 2–51, 3–52, and so on until the end of the essay is reached (Díez-Ortega and Kyle, 2023). The final MATTR scores were obtained by averaging the TTR scores for all 50-word windows. The following formula was employed to derive MATTR:

\(\mathrm{MATTR}(W)=\frac{\sum_{i=1}^{N-W+1}F_{i}}{W(N-W+1)}\) (9)

Here, \(N\) refers to the number of tokens in the text and \(W\) is the chosen window size in tokens (\(W < N\); 50 in this study). \(F_{i}\) is the number of types in the \(i\)-th window. \(\mathrm{MATTR}(W)\) is therefore the mean of the type-token ratios (TTRs), based on the word form, over all windows. It is expected that individuals with higher language proficiency will produce texts with greater lexical diversity, as indicated by higher MATTR scores.
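The MATTR formula can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function name `mattr`, the pre-tokenized input, and the fallback to a plain TTR when the text is shorter than the window are all choices made here.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over all windows of `window` tokens."""
    n = len(tokens)
    if n == 0:
        raise ValueError("text has no tokens")
    if n < window:
        # Assumed fallback: plain TTR when the text is shorter than one window.
        return len(set(tokens)) / n
    # F_i: number of distinct word forms (types) in the i-th window.
    type_counts = [len(set(tokens[i:i + window])) for i in range(n - window + 1)]
    # Sum of F_i divided by W * (N - W + 1), as in the formula.
    return sum(type_counts) / (window * (n - window + 1))


# Toy example with a window of 2: the windows contain 1, 2 and 1 types.
score = mattr(["a", "a", "b", "b"], window=2)
print(round(score, 4))  # (1 + 2 + 1) / (2 * 3) = 0.6667
```

Because every window has the same length, averaging the per-window TTRs and dividing the summed type counts by W(N − W + 1) give the same value.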

Lexical density was captured by the ratio of the number of lexical words to the total number of words (Lu, 2012). Lexical sophistication refers to the utilization of advanced vocabulary, often evaluated through word frequency indices (Crossley et al. 2013; Haberman, 2008; Kyle and Crossley, 2015; Laufer and Nation, 1995; Lu, 2012; Read, 2000). In the context of writing, lexical sophistication can be interpreted as vocabulary breadth, which entails the appropriate usage of vocabulary items across various lexico-grammatical contexts and registers (Garner et al. 2019; Kim et al. 2018; Kyle et al. 2018). In Japanese specifically, words are considered lexically sophisticated if they are not included in the “Japanese Education Vocabulary List Ver 1.0” (footnote 4). Consequently, lexical sophistication was calculated by determining the number of sophisticated word types relative to the total number of words per essay. Furthermore, it has been suggested that, in Japanese writing, sentences should ideally have a length of no more than 40 to 50 characters, as this promotes readability. Therefore, the median and maximum sentence length can be considered as useful indices for assessment (Ishioka and Kameda, 2006).
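The two ratios described above can be sketched as follows. This is an illustrative sketch only: the function names, the coarse part-of-speech tag set treated as "lexical", and the tiny stand-in vocabulary list are assumptions, and a real pipeline would need a Japanese morphological analyzer and the actual vocabulary list.

```python
# Hypothetical coarse POS tags counted as lexical (content) words.
LEXICAL_POS = frozenset({"NOUN", "VERB", "ADJ", "ADV"})


def lexical_density(tagged_tokens):
    """Number of lexical words divided by the total number of words.

    `tagged_tokens` is a list of (surface, pos) pairs.
    """
    if not tagged_tokens:
        raise ValueError("essay has no tokens")
    lexical = sum(1 for _, pos in tagged_tokens if pos in LEXICAL_POS)
    return lexical / len(tagged_tokens)


def lexical_sophistication(lemmas, basic_vocabulary):
    """Number of sophisticated word types divided by the total number of words.

    A word counts as sophisticated when it is absent from `basic_vocabulary`,
    which stands in for the vocabulary list named in the text.
    """
    if not lemmas:
        raise ValueError("essay has no tokens")
    sophisticated_types = {lemma for lemma in lemmas if lemma not in basic_vocabulary}
    return len(sophisticated_types) / len(lemmas)


# Romanized toy essay: "neko ga sakana o taberu" (the cat eats fish).
essay = [("neko", "NOUN"), ("ga", "PART"), ("sakana", "NOUN"),
         ("o", "PART"), ("taberu", "VERB")]
print(lexical_density(essay))  # 3 lexical words out of 5 tokens
print(lexical_sophistication(["taberu", "taberu", "shouhi", "neko"],
                             basic_vocabulary={"taberu", "neko"}))
```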

Syntactic complexity was assessed based on several measures, including the mean length of clauses, verb phrases per T-unit, clauses per T-unit, dependent clauses per T-unit, complex nominals per clause, adverbial clauses per clause, coordinate phrases per clause, and mean dependency distance (MDD). The MDD reflects the distance between the governor and dependent positions in a sentence. A larger dependency distance indicates a higher cognitive load and greater complexity in syntactic processing (Liu, 2008; Liu et al. 2017). The MDD has been established as an efficient metric for measuring syntactic complexity (Jiang, Quyang, and Liu, 2019; Li and Yan, 2021). To calculate the MDD, the words in a sentence are assumed to be numbered in linear order, such as W1 … Wi … Wn. In any dependency relationship between words Wa and Wb, Wa is the governor and Wb is the dependent, and the dependency distance is the difference between their position numbers. The MDD of the entire sentence is obtained by averaging the absolute values of these governor – dependent differences:

\(\mathrm{MDD}=\frac{1}{n}\sum_{i=1}^{n}\left|DD_{i}\right|\) (10)

In this formula, \(n\) represents the number of words in the sentence, and \(DD_{i}\) is the dependency distance of the \(i\)-th dependency relationship of the sentence. For example, the sentence ‘Mary-ga-John-ni-keshigomu-o-watashita’ is annotated as [Mary-nom-John-dat-eraser-acc-give-past], and its MDD would be 2. Table 3 provides the CSV file used as a prompt for GPT-4.
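The MDD of the example sentence can be reproduced with a short Python sketch. Several assumptions are made here and are not taken from the study: each particle-marked phrase is treated as one unit (Mary-ga = 1, John-ni = 2, keshigomu-o = 3, watashita = 4), all three arguments are taken to depend on the verb, and the mean is taken over the number of dependency relations. Under these assumptions the result matches the reported value of 2.

```python
def mean_dependency_distance(dependencies):
    """Mean absolute distance between governor and dependent positions.

    `dependencies` is a list of (governor_position, dependent_position)
    pairs using 1-based positions in linear order.
    """
    if not dependencies:
        raise ValueError("sentence has no dependency relations")
    distances = [abs(governor - dependent) for governor, dependent in dependencies]
    return sum(distances) / len(distances)


# Assumed parse of 'Mary-ga John-ni keshigomu-o watashita':
# the verb (position 4) governs the three particle-marked phrases.
example = [(4, 1), (4, 2), (4, 3)]
print(mean_dependency_distance(example))  # (3 + 2 + 1) / 3 = 2.0
```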

Cohesion (semantic similarity) and content elaboration aim to capture the ideas presented in test takers’ essays. Cohesion was assessed using three measures: synonym overlap/paragraph (topic), synonym overlap/paragraph (keywords), and word2vec cosine similarity. Content elaboration and development were measured as the number of metadiscourse markers (type)/number of words. To capture content closely, this study proposed a novel distance-based representation that encodes the cosine distance between the i-vectors of the learner’s essay and of the essay task (topic and keywords). The learner’s essay is decoded into a word sequence and aligned to the essay task’s topic and keywords for log-likelihood measurement. The cosine distance reveals the content elaboration score of the learner’s essay. The mathematical equation of cosine similarity between target-reference vectors is shown in (11), where \(L=(L_{1},\ldots,L_{n})\) and \(N=(N_{1},\ldots,N_{n})\) are the vectors representing the learner’s essay and the task’s topic and keywords, respectively. The content elaboration distance between \(L\) and \(N\) was calculated as follows:

\(\cos \left(\theta \right)=\frac{{\rm{L}}\,\cdot\, {\rm{N}}}{\left|{\rm{L}}\right|{\rm{|N|}}}=\frac{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}{N}_{i}}{\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}^{2}}\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{N}_{i}^{2}}}\)

A high similarity value indicates a low difference between the two recognition outcomes, which in turn suggests a high level of proficiency in content elaboration.
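As a small illustration of the cosine similarity in (11), the following Python sketch computes the value for two equal-length vectors. The vectors are made-up numbers rather than embeddings from the study, and the function name is an assumption introduced for this example.

```python
import math


def cosine_similarity(l_vec, n_vec):
    """Cosine similarity between a learner vector L and a task vector N."""
    if len(l_vec) != len(n_vec):
        raise ValueError("vectors must have the same length")
    dot = sum(l * n for l, n in zip(l_vec, n_vec))
    norm_l = math.sqrt(sum(l * l for l in l_vec))
    norm_n = math.sqrt(sum(n * n for n in n_vec))
    if norm_l == 0 or norm_n == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_l * norm_n)


# Hypothetical three-dimensional vectors for an essay and its task prompt.
learner_vector = [0.2, 0.7, 0.1]
task_vector = [0.3, 0.6, 0.2]
print(round(cosine_similarity(learner_vector, task_vector), 3))
```

Vectors pointing in the same direction give a value near 1, and orthogonal vectors give 0.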

To evaluate the effectiveness of the proposed measures in distinguishing different proficiency levels among nonnative Japanese speakers’ writing, we conducted a multi-faceted Rasch measurement analysis (Linacre, 1994 ). This approach applies measurement models to thoroughly analyze various factors that can influence test outcomes, including test takers’ proficiency, item difficulty, and rater severity, among others. The underlying principles and functionality of multi-faceted Rasch measurement are illustrated in (12).

\(\log \left(\frac{{P}_{{nijk}}}{{P}_{{nij}(k-1)}}\right)={B}_{n}-{D}_{i}-{C}_{j}-{F}_{k}\)

(12) defines the logarithmic transformation of the probability ratio \({P}_{nijk}/{P}_{nij(k-1)}\) as a function of multiple parameters. Here, n represents the test taker, i denotes a writing proficiency measure, j corresponds to the human rater, and k represents the proficiency score. The parameter \({B}_{n}\) signifies the proficiency level of test taker n (where n ranges from 1 to N). \({D}_{i}\) represents the difficulty parameter of test item i (where i ranges from 1 to L), while \({C}_{j}\) represents the severity of rater j (where j ranges from 1 to J). Additionally, \({F}_{k}\) represents the step difficulty for a test taker to move from score k-1 to k. \({P}_{nijk}\) refers to the probability of rater j assigning score k to test taker n for test item i. \({P}_{nij(k-1)}\) represents the probability of test taker n being assigned score k-1 by rater j for test item i. Each facet within the test is treated as an independent parameter and estimated within the same reference framework. To evaluate the consistency of scores obtained through both human and computer analysis, we utilized the Infit mean-square statistic. This statistic is a chi-square measure divided by its degrees of freedom and is information-weighted. It is more sensitive to unexpected patterns in responses to items near a person’s proficiency level (Linacre, 2002). Fit statistics are assessed against predefined thresholds for acceptable fit. For the Infit MNSQ, which has a mean of 1.00, different thresholds have been suggested. Some propose stricter thresholds ranging from 0.7 to 1.3 (Bond et al. 2021), while others suggest more lenient thresholds ranging from 0.5 to 1.5 (Eckes, 2009). In this study, we adopted the criterion of 0.70–1.30 for the Infit MNSQ.
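To make the relation in (12) concrete, the sketch below derives the score-category probabilities implied by the adjacent-category log-odds and checks an Infit value against the 0.70–1.30 criterion. It is an illustrative Python sketch, not the estimation software used in the study; the function names and all parameter values are hypothetical.

```python
import math


def category_probabilities(b_n, d_i, c_j, step_difficulties):
    """Return [P_0, ..., P_K] such that log(P_k / P_(k-1)) = B_n - D_i - C_j - F_k.

    `step_difficulties` holds F_1 ... F_K; the lowest category is the baseline.
    """
    log_numerators = [0.0]
    for f_k in step_difficulties:
        log_numerators.append(log_numerators[-1] + (b_n - d_i - c_j - f_k))
    largest = max(log_numerators)  # subtract the maximum for numerical stability
    weights = [math.exp(value - largest) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


def infit_within_bounds(mnsq, low=0.70, high=1.30):
    """Check an Infit MNSQ value against the criterion adopted in the text."""
    return low <= mnsq <= high


# Hypothetical facet values in logits: ability, measure difficulty, rater severity.
probs = category_probabilities(0.5, 0.2, 0.1, [-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
print(infit_within_bounds(0.76), infit_within_bounds(1.46))
```

Taking the log of the ratio of any two adjacent probabilities returned by the function recovers the right-hand side of (12) for that step.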

Moving forward, we can now proceed to assess the effectiveness of the 16 proposed measures based on five criteria for accurately distinguishing various levels of writing proficiency among non-native Japanese speakers. To conduct this evaluation, we utilized the development dataset from the I-JAS corpus, as described in Section Dataset . Table 4 provides a measurement report that presents the performance details of the 14 metrics under consideration. The measure separation was found to be 4.02, indicating a clear differentiation among the measures. The reliability index for the measure separation was 0.891, suggesting consistency in the measurement. Similarly, the person separation reliability index was 0.802, indicating the accuracy of the assessment in distinguishing between individuals. All 16 measures demonstrated Infit mean squares within a reasonable range, ranging from 0.76 to 1.28. The Synonym overlap/paragraph (topic) measure exhibited a relatively high outfit mean square of 1.46, although the Infit mean square falls within an acceptable range. The standard error for the measures ranged from 0.13 to 0.28, indicating the precision of the estimates.

Table 5 further shows the weights assigned to the linguistic measures for score prediction, with higher weights indicating stronger correlations between those measures and higher scores. The following measures had higher weights than the others: moving average type-token ratio per essay (0.0391); mean dependency distance (0.0388); mean length of clause, calculated by dividing the number of words by the number of clauses (0.0374); complex nominals per T-unit, calculated by dividing the number of complex nominals by the number of T-units (0.0379); coordinate phrases rate, calculated by dividing the number of coordinate phrases by the number of clauses (0.0325); and grammatical error rate, representing the number of errors per essay (0.0322).

Criteria (output indicator)

The criteria used to evaluate the writing ability in this study were based on CEFR, which follows a six-point scale ranging from A1 to C2. To assess the quality of Japanese writing, the scoring criteria from Table 6 were utilized. These criteria were derived from the IELTS writing standards and served as assessment guidelines and prompts for the written output.

A prompt is a question or detailed instruction that is provided to the model to obtain a proper response. After several pilot experiments, we decided to provide the measures (Section Measures of writing proficiency for nonnative Japanese) as the input prompt and use the criteria (Section Criteria (output indicator)) as the output indicator. Regarding the prompt language, given that the LLM was tasked with rating Japanese essays, would prompts in Japanese work better (Footnote 5)? We conducted experiments comparing the performance of GPT-4 using both English and Japanese prompts. Additionally, we utilized the Japanese local model OCLL with Japanese prompts. Multiple trials were conducted using the same sample. Regardless of the prompt language used, we consistently obtained the same grading result with GPT-4, which assigned a grade of B1 to the writing sample. This suggested that GPT-4 is reliable and capable of producing consistent ratings regardless of the prompt language. On the other hand, when we used Japanese prompts with OCLL, we encountered inconsistent grading results. Out of 10 attempts with OCLL, only 6 yielded the same grade (B1), while the remaining 4 produced different outcomes, including A1 and B2. These findings indicated that the language of the prompt was not the determining factor for reliable AES. Instead, the size of the training data and the model parameters played crucial roles in achieving consistent and reliable AES results.
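The repeated-run check described above can be summarized with a short Python sketch that reports the most frequent grade and the share of runs that produced it. The function name is hypothetical, and the split of the four non-B1 OCLL runs between A1 and B2 is an assumption, because the text reports only that both grades occurred.

```python
from collections import Counter


def modal_grade_share(grades):
    """Return the most frequent grade and the proportion of runs that gave it."""
    if not grades:
        raise ValueError("at least one grading run is required")
    grade, frequency = Counter(grades).most_common(1)[0]
    return grade, frequency / len(grades)


# Ten OCLL runs on the same essay: six B1 grades as reported in the text;
# the A1/B2 split of the remaining four runs is assumed for illustration.
ocll_runs = ["B1"] * 6 + ["A1", "A1", "B2", "B2"]
print(modal_grade_share(ocll_runs))
```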

The following is the utilized prompt, which details all measures and requires the LLM to score the essays using holistic and trait scores.

Please evaluate Japanese essays written by Japanese learners and assign a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, C1 to C2. Additionally, please provide trait scores and display the calculation process for each trait score. The scoring should be based on the following criteria:

Moving average type-token ratio.

Number of lexical words (token) divided by the total number of words per essay.

Number of sophisticated word types divided by the total number of words per essay.

Mean length of clause.

Verb phrases per T-unit.

Clauses per T-unit.

Dependent clauses per T-unit.

Complex nominals per clause.

Adverbial clauses per clause.

Coordinate phrases per clause.

Mean dependency distance.

Synonym overlap paragraph (topic and keywords).

Word2vec cosine similarity.

Connectives per essay.

Conjunctions per essay.

Number of metadiscourse markers (types) divided by the total number of words.

Number of errors per essay.

Japanese essay text

出かける前に二人が地図を見ている間に、サンドイッチを入れたバスケットに犬が入ってしまいました。それに気づかずに二人は楽しそうに出かけて行きました。やがて突然犬がバスケットから飛び出し、二人は驚きました。バスケットの中を見ると、食べ物はすべて犬に食べられていて、二人は困ってしまいました。(ID_JJJ01_SW1)

The score of the example above was B1. Figure 3 provides an example of holistic and trait scores provided by GPT-4 (with a prompt indicating all measures) via Bing (Footnote 6).

Figure 3. Example of GPT-4 AES and feedback (with a prompt indicating all measures).
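For readers who want to reproduce the prompt layout, the following Python sketch assembles the instruction, the list of measures, and the essay into one string. It only builds text and does not call any model. The numbering of the measures and the names `INSTRUCTION`, `MEASURES`, and `build_prompt` are assumptions made for this illustration.

```python
INSTRUCTION = (
    "Please evaluate Japanese essays written by Japanese learners and assign "
    "a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, "
    "C1 to C2. Additionally, please provide trait scores and display the "
    "calculation process for each trait score. The scoring should be based "
    "on the following criteria:"
)

MEASURES = [
    "Moving average type-token ratio.",
    "Number of lexical words (token) divided by the total number of words per essay.",
    "Number of sophisticated word types divided by the total number of words per essay.",
    "Mean length of clause.",
    "Verb phrases per T-unit.",
    "Clauses per T-unit.",
    "Dependent clauses per T-unit.",
    "Complex nominals per clause.",
    "Adverbial clauses per clause.",
    "Coordinate phrases per clause.",
    "Mean dependency distance.",
    "Synonym overlap paragraph (topic and keywords).",
    "Word2vec cosine similarity.",
    "Connectives per essay.",
    "Conjunctions per essay.",
    "Number of metadiscourse markers (types) divided by the total number of words.",
    "Number of errors per essay.",
]


def build_prompt(essay_text, measures=MEASURES):
    """Join the instruction, the measures (numbered here by assumption), and the essay."""
    lines = [INSTRUCTION, ""]
    lines += [f"{index}. {measure}" for index, measure in enumerate(measures, start=1)]
    lines += ["", "Japanese essay text", essay_text]
    return "\n".join(lines)


print(build_prompt("(essay text goes here)"))
```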

Statistical analysis

The aim of this study is to investigate the potential use of LLM for nonnative Japanese AES. It seeks to compare the scoring outcomes obtained from feature-based AES tools, which rely on conventional machine learning technology (i.e. Jess, JWriter), with those generated by AI-driven AES tools utilizing deep learning technology (BERT, GPT, OCLL). To assess the reliability of a computer-assisted annotation tool, the study initially established human-human agreement as the benchmark measure. Subsequently, the performance of the LLM-based method was evaluated by comparing it to human-human agreement.

To assess annotation agreement, the study employed standard measures such as precision, recall, and F-score (Brants 2000; Lu 2010), along with the quadratically weighted kappa (QWK), to evaluate the consistency and agreement in the annotation process. Assume A and B represent human annotators. When comparing the annotations of the two annotators, the following results are obtained. Precision, recall, and F-score are defined in equations (13) to (15).



The F-score is the harmonic mean of recall and precision:

\({\rm{F}}-{\rm{score}}=\frac{2* ({\rm{Precision}}* {\rm{Recall}})}{{\rm{Precision}}+{\rm{Recall}}}\)

The highest possible value of an F-score is 1.0, indicating perfect precision and recall, and the lowest possible value is 0, which occurs if either precision or recall is zero.
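The agreement measures can be sketched in Python as follows. The sketch assumes that each annotator's output is a set of annotated unit identifiers and that annotator A is treated as the reference, so precision divides the shared units by B's total and recall divides them by A's total. This orientation, the function name, and the unit labels are assumptions for illustration. The F-score is symmetric, so it does not depend on the choice of reference.

```python
def precision_recall_f(units_a, units_b):
    """Precision, recall, and F-score of annotator B measured against annotator A."""
    set_a, set_b = set(units_a), set(units_b)
    identical = len(set_a & set_b)  # units both annotators marked
    precision = identical / len(set_b) if set_b else 0.0
    recall = identical / len(set_a) if set_a else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f_score


# Hypothetical clause identifiers marked by two annotators.
annotator_a = {"c1", "c2", "c3", "c4"}
annotator_b = {"c2", "c3", "c5"}
print(precision_recall_f(annotator_a, annotator_b))
```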

In accordance with Taghipour and Ng ( 2016 ), the calculation of QWK involves two steps:

Step 1: Construct a weight matrix W as follows:

\({W}_{i,j}=\frac{{(i-j)}^{2}}{{(N-1)}^{2}}\)

Here, i represents the annotation made by the tool, while j represents the annotation made by a human rater. N denotes the total number of possible annotations. Matrix O is subsequently computed, where \({O}_{i,j}\) represents the count of data annotated as i by the tool and as j by the human annotator. E refers to the expected count matrix, which is normalized so that the sum of its elements matches the sum of the elements in O.

Step 2: With matrices O and E, the QWK is obtained as follows:

\(K=1-\frac{\sum_{i,j}{W}_{i,j}\,{O}_{i,j}}{\sum_{i,j}{W}_{i,j}\,{E}_{i,j}}\)
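The two steps can be sketched in plain Python. The sketch assumes ratings are encoded as integers from 0 to N-1 (for example A1 = 0 through C2 = 5), uses the quadratic weights \((i-j)^2/(N-1)^2\) of Taghipour and Ng (2016), and builds E from the product of the two raters' marginal counts divided by the number of essays so that E and O have the same total. The function name and the sample ratings are hypothetical.

```python
def quadratic_weighted_kappa(tool_ratings, human_ratings, n_categories):
    """Quadratically weighted kappa for two lists of integer ratings in 0..N-1."""
    if len(tool_ratings) != len(human_ratings) or not tool_ratings:
        raise ValueError("rating lists must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("at least two rating categories are required")
    n = n_categories
    total = len(tool_ratings)

    # Observed counts O[i][j]: tool assigned i, human assigned j.
    observed = [[0] * n for _ in range(n)]
    for i, j in zip(tool_ratings, human_ratings):
        observed[i][j] += 1

    tool_hist = [sum(row) for row in observed]
    human_hist = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = tool_hist[i] * human_hist[j] / total  # E sums to `total`
            numerator += weight * observed[i][j]
            denominator += weight * expected
    if denominator == 0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return 1 - numerator / denominator


# Hypothetical ratings for six essays on the six-point scale (A1 = 0 ... C2 = 5).
tool = [0, 1, 2, 3, 4, 5]
human = [0, 1, 2, 3, 5, 4]
print(round(quadratic_weighted_kappa(tool, human, 6), 3))
```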

The value of the quadratic weighted kappa increases as the level of agreement improves. Further, to assess the accuracy of LLM scoring, the proportional reduction in mean squared error (PRMSE) was employed. The PRMSE approach takes into account the variability observed in human ratings to estimate the rater error, which is then subtracted from the variance of the human labels. This calculation provides an overall measure of agreement between the automated scores and true scores (Haberman et al. 2015; Loukina et al. 2020; Taghipour and Ng, 2016). The computation of PRMSE involves the following steps:

Step 1: Calculate the mean squared errors (MSEs) for the scoring outcomes of the computer-assisted tool (MSE tool) and the human scoring outcomes (MSE human).

Step 2: Determine the PRMSE by comparing the MSE of the computer-assisted tool (MSE tool) with the MSE from human raters (MSE human), using the following formula:

\({\rm{PRMSE}}=1-\frac{{\rm{MSE}}\,{\rm{tool}}}{{\rm{MSE}}\,{\rm{human}}}=1-\frac{\sum_{i=1}^{n}{({y}_{i}-{\hat{y}}_{i})}^{2}}{\sum_{i=1}^{n}{({y}_{i}-\hat{y})}^{2}}\)

In the numerator, \({\hat{y}}_{i}\) represents the scoring outcome predicted by a specific LLM-driven AES system for a given sample. The term \({y}_{i}-{\hat{y}}_{i}\) represents the difference between this predicted outcome and the mean value of all LLM-driven AES systems’ scoring outcomes. It quantifies the deviation of the specific LLM-driven AES system’s prediction from the average prediction of all LLM-driven AES systems. In the denominator, \({y}_{i}-\hat{y}\) represents the difference between the scoring outcome provided by a specific human rater for a given sample and the mean value of all human raters’ scoring outcomes. It measures the discrepancy between the specific human rater’s score and the average score given by all human raters. The PRMSE is then calculated by subtracting the ratio of MSE tool to MSE human from 1. PRMSE falls within the range of 0 to 1, with larger values indicating smaller errors in the LLM’s scoring relative to those of human raters. In other words, a higher PRMSE implies that the LLM’s scoring predicts the true scores more accurately (Loukina et al. 2020).

Kappa values were interpreted following Landis and Koch (1977): −1 indicates complete inconsistency, 0 indicates random agreement, 0.0–0.20 indicates an extremely low level of agreement (slight), 0.21–0.40 a low level of agreement (fair), 0.41–0.60 a medium level of agreement (moderate), 0.61–0.80 a high level of agreement (substantial), and 0.81–1 an almost perfect level of agreement. All statistical analyses were executed using Python scripts.
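A literal reading of the ratio above, together with the kappa bands, can be sketched in Python. The sketch treats \(y_i\) as human scores, \(\hat{y}_i\) as tool scores, and the denominator as the spread of the human scores around their mean; it does not reproduce the rater-error correction used in the full PRMSE of Loukina et al. (2020). The function names, the sample scores, and the label returned for negative kappa values are assumptions.

```python
def prmse_ratio(human_scores, tool_scores):
    """1 - MSE(tool) / MSE(human), following the simplified ratio in the text."""
    if len(human_scores) != len(tool_scores) or not human_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(human_scores)
    human_mean = sum(human_scores) / n
    mse_tool = sum((y - y_hat) ** 2 for y, y_hat in zip(human_scores, tool_scores)) / n
    mse_human = sum((y - human_mean) ** 2 for y in human_scores) / n
    if mse_human == 0:
        raise ValueError("human scores have zero variance; the ratio is undefined")
    return 1 - mse_tool / mse_human


def interpret_kappa(kappa):
    """Map a kappa value to the agreement bands listed in the text."""
    if kappa < 0:
        return "poor"  # assumed label for values below chance
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical scores on a numeric 1-6 scale for five essays.
human = [2, 3, 3, 4, 5]
tool = [2, 3, 4, 4, 5]
print(round(prmse_ratio(human, tool), 3), interpret_kappa(0.819))
```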

Results and discussion

Annotation reliability of the LLM

This section focuses on assessing the reliability of the LLM’s annotation and scoring capabilities. To evaluate the reliability, several tests were conducted simultaneously, aiming to achieve the following objectives:

Assess the LLM’s ability to differentiate between test takers with varying levels of writing proficiency.

Determine the level of agreement between the annotations and scoring performed by the LLM and those done by human raters.

The evaluation of the results encompassed several metrics, including: precision, recall, F-Score, quadratically-weighted kappa, proportional reduction of mean squared error, Pearson correlation, and multi-faceted Rasch measurement.

Inter-annotator agreement (human–human annotator agreement)

We started with an agreement test of the two human annotators. Two trained annotators were recruited to determine the writing task data measures. A total of 714 scripts, as the test data, was utilized. Each analysis lasted 300–360 min. Inter-annotator agreement was evaluated using the standard measures of precision, recall, and F-score and QWK. Table 7 presents the inter-annotator agreement for the various indicators. As shown, the inter-annotator agreement was fairly high, with F-scores ranging from 1.0 for sentence and word number to 0.666 for grammatical errors.

The findings from the QWK analysis provided further confirmation of the inter-annotator agreement. The QWK values covered a range from 0.950 ( p  = 0.000) for sentence and word number to 0.695 for synonym overlap number (keyword) and grammatical errors ( p  = 0.001).

Agreement of annotation outcomes between human and LLM

To evaluate the consistency between human annotators and LLM annotators (BERT, GPT, OCLL) across the indices, the same test was conducted. The results of the inter-annotator agreement (F-score) between LLM and human annotation are provided in Appendix B-D. The F-scores ranged from 0.706 for Grammatical error # for OCLL-human to a perfect 1.000 for GPT-human, for sentences, clauses, T-units, and words. These findings were further supported by the QWK analysis, which showed agreement levels ranging from 0.807 ( p  = 0.001) for metadiscourse markers for OCLL-human to 0.962 for words ( p  = 0.000) for GPT-human. The findings demonstrated that the LLM annotation achieved a significant level of accuracy in identifying measurement units and counts.

Reliability of LLM-driven AES’s scoring and discriminating proficiency levels

This section examines the reliability of the LLM-driven AES scoring through a comparison of the scoring outcomes produced by human raters and the LLM ( Reliability of LLM-driven AES scoring ). It also assesses the effectiveness of the LLM-based AES system in differentiating participants with varying proficiency levels ( Reliability of LLM-driven AES discriminating proficiency levels ).

Reliability of LLM-driven AES scoring

Table 8 summarizes the QWK coefficient analysis between the scores computed by the human raters and GPT-4 for the individual essays from I-JAS (Footnote 7). As shown, the QWK of the measures ranged from k = 0.644 for word2vec cosine similarity to k = 0.819 for lexical density (number of lexical words (tokens)/number of words per essay). Table 9 further presents the Pearson correlations between the 16 writing proficiency measures scored by human raters and GPT-4 for the individual essays. The correlations ranged from 0.672 for syntactic complexity to 0.734 for grammatical accuracy. The correlations between the writing proficiency scores assigned by human raters and the BERT-based AES system ranged from 0.661 for syntactic complexity to 0.713 for grammatical accuracy. The correlations between the writing proficiency scores given by human raters and the OCLL-based AES system ranged from 0.654 for cohesion to 0.721 for grammatical accuracy. These findings indicated an alignment between the assessments made by human raters and both the BERT-based and OCLL-based AES systems across various aspects of writing proficiency.
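The Pearson correlation used for these comparisons can be computed with a few lines of Python. The sketch below is a generic implementation, and the two score lists are invented numbers rather than values from Tables 8 or 9.

```python
import math


def pearson_r(x_values, y_values):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x_values) != len(y_values) or len(x_values) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    var_x = sum((x - mean_x) ** 2 for x in x_values)
    var_y = sum((y - mean_y) ** 2 for y in y_values)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical trait scores from a human rater and an automated system.
human_scores = [3.0, 4.0, 2.5, 5.0, 3.5]
system_scores = [3.2, 3.8, 2.9, 4.6, 3.1]
print(round(pearson_r(human_scores, system_scores), 3))
```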

Reliability of LLM-driven AES discriminating proficiency levels

After validating the reliability of the LLM’s annotation and scoring, the subsequent objective was to evaluate its ability to distinguish between various proficiency levels. For this analysis, a dataset of 686 individual essays was utilized. Table 10 presents a sample of the results, summarizing the means, standard deviations, and the outcomes of the one-way ANOVAs based on the measures assessed by the GPT-4 model. A post hoc multiple comparison test, specifically the Bonferroni test, was conducted to identify any potential differences between pairs of levels.

As the results reveal, seven measures showed a linear upward or downward progression across the three proficiency levels. These are marked in bold in Table 10 and comprise one measure of lexical richness, i.e. MATTR (lexical diversity); four measures of syntactic complexity, i.e. MDD (mean dependency distance), MLC (mean length of clause), CNT (complex nominals per T-unit), and CPC (coordinate phrases rate); one cohesion measure, i.e. word2vec cosine similarity; and one measure of grammatical accuracy, i.e. GER (grammatical error rate). Regarding the ability of the sixteen measures to distinguish adjacent proficiency levels, the Bonferroni tests indicated statistically significant differences between the primary level and the intermediate level for MLC and GER. One measure of lexical richness, namely LD, along with four measures of syntactic complexity (VPT, CT, DCT, ACC), two measures of cohesion (SOPT, SOPK), and one measure of content elaboration (IMM), exhibited statistically significant differences between proficiency levels. However, these differences did not follow a linear progression between adjacent proficiency levels. No significant difference was observed in lexical sophistication between proficiency levels.
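The one-way ANOVA and the Bonferroni adjustment mentioned above can be illustrated with a short Python sketch. It computes only the F statistic, its degrees of freedom, and the adjusted significance level for three pairwise comparisons; p-values would require an F-distribution routine from a statistics library, which is left out here. The function names and the group values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    if len(groups) < 2 or any(len(group) == 0 for group in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(group) for group in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    if df_within <= 0:
        raise ValueError("need more observations than groups")
    grand_mean = sum(sum(group) for group in groups) / n_total
    group_means = [sum(group) / len(group) for group in groups]
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2 for group, mean in zip(groups, group_means)
    )
    ss_within = sum(
        (value - mean) ** 2 for group, mean in zip(groups, group_means) for value in group
    )
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after a Bonferroni adjustment."""
    return alpha / n_comparisons


# Hypothetical values of one measure at three proficiency levels.
levels = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(one_way_anova_f(levels), bonferroni_alpha(0.05, 3))
```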

To summarize, our study aimed to evaluate the reliability and differentiation capabilities of the LLM-driven AES method. For the first objective, we assessed the LLM’s ability to differentiate between test takers with varying levels of writing proficiency using precision, recall, F-score, and quadratically weighted kappa. Regarding the second objective, we compared the scoring outcomes generated by human raters and the LLM to determine the level of agreement. We employed quadratically weighted kappa and Pearson correlations to compare the 16 writing proficiency measures for the individual essays. The results confirmed the feasibility of using the LLM for annotation and scoring in AES for nonnative Japanese. As a result, Research Question 1 has been addressed.

Comparison of BERT-, GPT-, OCLL-based AES, and linguistic-feature-based computation methods

This section compares the effectiveness of five AES methods for nonnative Japanese writing: LLM-driven approaches utilizing BERT, GPT, and OCLL, and linguistic feature-based approaches using Jess and JWriter. Each approach was evaluated by comparing its ratings with human ratings. All ratings were derived from the dataset introduced in Section Dataset. The agreement between the automated methods and human ratings was assessed using QWK and PRMSE. The performance of each approach is summarized in Table 11.

The QWK coefficient values indicate that LLMs (GPT, BERT, OCLL) and human rating outcomes demonstrated higher agreement compared to feature-based AES methods (Jess and JWriter) in assessing writing proficiency criteria, including lexical richness, syntactic complexity, content, and grammatical accuracy. Among the LLMs, the GPT-4 driven AES and human rating outcomes showed the highest agreement in all criteria, except for syntactic complexity. The PRMSE values suggest that the GPT-based method outperformed linguistic feature-based methods and other LLM-based approaches. Moreover, an interesting finding emerged during the study: the agreement coefficient between GPT-4 and human scoring was even higher than the agreement between different human raters themselves. This discovery highlights the advantage of GPT-based AES over human rating. Ratings involve a series of processes, including reading the learners’ writing, evaluating the content and language, and assigning scores. Within this chain of processes, various biases can be introduced, stemming from factors such as rater biases, test design, and rating scales. These biases can impact the consistency and objectivity of human ratings. GPT-based AES may benefit from its ability to apply consistent and objective evaluation criteria. By prompting the GPT model with detailed writing scoring rubrics and linguistic features, potential biases in human ratings can be mitigated. The model follows a predefined set of guidelines and does not possess the same subjective biases that human raters may exhibit. This standardization in the evaluation process contributes to the higher agreement observed between GPT-4 and human scoring. Section Prompt strategy of the study delves further into the role of prompts in the application of LLMs to AES. It explores how the choice and implementation of prompts can impact the performance and reliability of LLM-based AES methods. 
Furthermore, it is important to acknowledge the strengths of the local model, i.e. the Japanese local model OCLL, which excels in processing certain idiomatic expressions. Nevertheless, our analysis indicated that GPT-4 surpasses local models in AES. This superior performance can be attributed to the larger parameter size of GPT-4, estimated to be between 500 billion and 1 trillion, which exceeds the sizes of both BERT and the local model OCLL.

Prompt strategy

In the context of prompt strategy, Mizumoto and Eguchi ( 2023 ) conducted a study where they applied the GPT-3 model to automatically score English essays in the TOEFL test. They found that the accuracy of the GPT model alone was moderate to fair. However, when they incorporated linguistic measures such as cohesion, syntactic complexity, and lexical features alongside the GPT model, the accuracy significantly improved. This highlights the importance of prompt engineering and providing the model with specific instructions to enhance its performance. In this study, a similar approach was taken to optimize the performance of LLMs. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. Model 1 was used as the baseline, representing GPT-4 without any additional prompting. Model 2, on the other hand, involved GPT-4 prompted with 16 measures that included scoring criteria, efficient linguistic features for writing assessment, and detailed measurement units and calculation formulas. The remaining models (Models 3 to 18) utilized GPT-4 prompted with individual measures. The performance of these 18 different models was assessed using the output indicators described in Section Criteria (output indicator) . By comparing the performances of these models, the study aimed to understand the impact of prompt engineering on the accuracy and effectiveness of GPT-4 in AES tasks.

Based on the PRMSE scores presented in Fig. 4 , it was observed that Model 1, representing GPT-4 without any additional prompting, achieved a fair level of performance. However, Model 2, which utilized GPT-4 prompted with all measures, outperformed all other models in terms of PRMSE score, achieving a score of 0.681. These results indicate that the inclusion of specific measures and prompts significantly enhanced the performance of GPT-4 in AES. Among the measures, syntactic complexity was found to play a particularly significant role in improving the accuracy of GPT-4 in assessing writing quality. Following that, lexical diversity emerged as another important factor contributing to the model’s effectiveness. The study suggests that a well-prompted GPT-4 can serve as a valuable tool to support human assessors in evaluating writing quality. By utilizing GPT-4 as an automated scoring tool, the evaluation biases associated with human raters can be minimized. This has the potential to empower teachers by allowing them to focus on designing writing tasks and guiding writing strategies, while leveraging the capabilities of GPT-4 for efficient and reliable scoring.

Figure 4. PRMSE scores of the 18 AES models.

This study aimed to investigate two main research questions: the feasibility of utilizing LLMs for AES and the impact of prompt engineering on the application of LLMs in AES.

To address the first objective, the study compared the effectiveness of five different models: GPT, BERT, the Japanese local LLM (OCLL), and two conventional machine learning-based AES tools (Jess and JWriter). The PRMSE values indicated that the GPT-4-based method outperformed other LLMs (BERT, OCLL) and linguistic feature-based computational methods (Jess and JWriter) across various writing proficiency criteria. Furthermore, the agreement coefficient between GPT-4 and human scoring surpassed the agreement among human raters themselves, highlighting the potential of using the GPT-4 tool to enhance AES by reducing biases and subjectivity, saving time, labor, and cost, and providing valuable feedback for self-study. Regarding the second goal, the role of prompt design was investigated by comparing 18 models, including a baseline model, a model prompted with all measures, and 16 models prompted with one measure at a time. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. The PRMSE scores of the models showed that GPT-4 prompted with all measures achieved the best performance, surpassing the baseline and other models.

In conclusion, this study has demonstrated the potential of LLMs in supporting human rating in assessments. By incorporating automation, we can save time and resources while reducing biases and subjectivity inherent in human rating processes. Automated language assessments offer the advantage of accessibility, providing equal opportunities and economic feasibility for individuals who lack access to traditional assessment centers or necessary resources. LLM-based language assessments provide valuable feedback and support to learners, aiding in the enhancement of their language proficiency and the achievement of their goals. This personalized feedback can cater to individual learner needs, facilitating a more tailored and effective language-learning experience.

There are three important areas that merit further exploration. First, prompt engineering requires attention to ensure optimal performance of LLM-based AES across different language types. This study revealed that GPT-4, when prompted with all measures, outperformed models prompted with fewer measures. Therefore, investigating and refining prompt strategies can enhance the effectiveness of LLMs in automated language assessments. Second, it is crucial to explore the application of LLMs in second-language assessment and learning for oral proficiency, as well as their potential in under-resourced languages. Recent advancements in self-supervised machine learning techniques have significantly improved automatic speech recognition (ASR) systems, opening up new possibilities for creating reliable ASR systems, particularly for under-resourced languages with limited data. However, challenges persist in the field of ASR. First, ASR assumes correct word pronunciation for automatic pronunciation evaluation, which proves challenging for learners in the early stages of language acquisition due to diverse accents influenced by their native languages. Accurately segmenting short words becomes problematic in such cases. Second, developing precise audio-text transcriptions for languages with non-native accented speech poses a formidable task. Last, assessing oral proficiency levels involves capturing various linguistic features, including fluency, pronunciation, accuracy, and complexity, which are not easily captured by current NLP technology.

Data availability

The dataset utilized was obtained from the International Corpus of Japanese as a Second Language (I-JAS). The data URLs: [ ].

J-CAT and TTBJ are two computerized adaptive tests used to assess Japanese language proficiency.

SPOT is a specific component of the TTBJ test.



The study utilized a prompt-based GPT-4 model, developed by OpenAI, which has an impressive architecture with 1.8 trillion parameters across 120 layers. GPT-4 was trained on a vast dataset of 13 trillion tokens, using two stages: initial training on internet text datasets to predict the next token, and subsequent fine-tuning through reinforcement learning from human feedback. . by Japanese Learning Dictionary Support Group 2015.

We express our sincere gratitude to the reviewer for bringing this matter to our attention.

On February 7, 2023, Microsoft began rolling out a major overhaul to Bing that included a new chatbot feature based on OpenAI’s GPT-4 (

Appendix E-F present the analysis results of the QWK coefficient between the scores computed by the human raters and the BERT, OCLL models.

Attali Y, Burstein J (2006) Automated essay scoring with e-rater® V.2. J. Technol., Learn. Assess., 4

Barkaoui K, Hadidi A (2020) Assessing Change in English Second Language Writing Performance (1st ed.). Routledge, New York.

Bentz C, Tatyana R, Koplenig A, Tanja S (2016) A comparison between morphological complexity measures: Typological data vs. language corpora. In Proceedings of the workshop on computational linguistics for linguistic complexity (CL4LC), 142–153. Osaka, Japan: The COLING 2016 Organizing Committee

Bond TG, Yan Z, Heene M (2021) Applying the Rasch model: Fundamental measurement in the human sciences (4th ed). Routledge

Brants T (2000) Inter-annotator agreement for a German newspaper corpus. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), Athens, Greece, 31 May-2 June, European Language Resources Association

Brown TB, Mann B, Ryder N, et al. (2020) Language models are few-shot learners. Advances in Neural Information Processing Systems, Online, 6–12 December, Curran Associates, Inc., Red Hook, NY

Burstein J (2003) The E-rater scoring engine: Automated essay scoring with natural language processing. In Shermis MD and Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Čech R, Miroslav K (2018) Morphological richness of text. In Masako F, Václav C (ed) Taming the corpus: From inflection and lexis to interpretation, 63–77. Cham, Switzerland: Springer Nature

Çöltekin Ç, Taraka R (2018) Exploiting Universal Dependencies treebanks for measuring morphosyntactic complexity. In Aleksandrs B, Christian B (ed), Proceedings of first workshop on measuring language complexity, 1–7. Torun, Poland

Crossley SA, Cobb T, McNamara DS (2013) Comparing count-based and band-based indices of word frequency: Implications for active vocabulary research and pedagogical applications. System 41:965–981.

Crossley SA, McNamara DS (2016) Say more and be more coherent: How text elaboration and cohesion can increase writing quality. J. Writ. Res. 7:351–370

CyberAgent Inc (2023) Open-Calm series of Japanese language models. Retrieved from:

Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, Minnesota, 2–7 June, pp. 4171–4186. Association for Computational Linguistics

Diez-Ortega M, Kyle K (2023) Measuring the development of lexical richness of L2 Spanish: a longitudinal learner corpus study. Studies in Second Language Acquisition 1-31

Eckes T (2009) On common ground? How raters perceive scoring criteria in oral proficiency testing. In Brown A, Hill K (ed) Language testing and evaluation 13: Tasks and criteria in performance assessment (pp. 43–73). Peter Lang Publishing

Elliot S (2003) IntelliMetric: from here to validity. In: Shermis MD, Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Engber CA (1995) The relationship of lexical proficiency to the quality of ESL compositions. J. Second Lang. Writ. 4:139–155

Garner J, Crossley SA, Kyle K (2019) N-gram measures and L2 writing proficiency. System 80:176–187.

Haberman SJ (2008) When can subscores have value? J. Educat. Behav. Stat., 33:204–229

Haberman SJ, Yao L, Sinharay S (2015) Prediction of true test scores from observed item scores and ancillary data. Brit. J. Math. Stat. Psychol. 68:363–385

Halliday MAK (1985) Spoken and Written Language. Deakin University Press, Melbourne, Australia

Hirao R, Arai M, Shimanaka H et al. (2020) Automated essay scoring system for nonnative Japanese learners. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pp. 1250–1257. European Language Resources Association

Hunt KW (1966) Recent Measures in Syntactic Development. Elementary English, 43(7), 732–739.

Ishioka T (2001) About e-rater, a computer-based automatic scoring system for essays [Konpyūta ni yoru essei no jidō saiten shisutemu e − rater ni tsuite]. University Entrance Examination. Forum [Daigaku nyūshi fōramu] 24:71–76

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput. 9(8):1735–1780

Ishioka T, Kameda M (2006) Automated Japanese essay scoring system based on articles written by experts. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–18 July 2006, pp. 233-240. Association for Computational Linguistics, USA

Japan Foundation (2021) Retrieved from:

Jarvis S (2013a) Defining and measuring lexical diversity. In Jarvis S, Daller M (ed) Vocabulary knowledge: Human ratings and automated measures (Vol. 47, pp. 13–44). John Benjamins.

Jarvis S (2013b) Capturing the diversity in lexical diversity. Lang. Learn. 63:87–106.

Jiang J, Ouyang J, Liu H (2019) Interlanguage: A perspective of quantitative linguistic typology. Lang. Sci. 74:85–97

Kim M, Crossley SA, Kyle K (2018) Lexical sophistication as a multidimensional phenomenon: Relations to second language lexical proficiency, development, and writing quality. Mod. Lang. J. 102(1):120–141.

Kojima T, Gu S, Reid M et al. (2022) Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, New Orleans, LA, 29 November-1 December, Curran Associates, Inc., Red Hook, NY

Kyle K, Crossley SA (2015) Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Q 49:757–786

Kyle K, Crossley SA, Berger CM (2018) The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behav. Res. Methods 50:1030–1046.

Kyle K, Crossley SA, Jarvis S (2021) Assessing the validity of lexical diversity using direct judgements. Lang. Assess. Q. 18:154–170.

Landauer TK, Laham D, Foltz PW (2003) Automated essay scoring and annotation of essays with the Intelligent Essay Assessor. In Shermis MD, Burstein JC (ed), Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 159–174

Laufer B, Nation P (1995) Vocabulary size and use: Lexical richness in L2 written production. Appl. Linguist. 16:307–322.

Lee J, Hasebe Y (2017) jWriter Learner Text Evaluator, URL:

Lee J, Kobayashi N, Sakai T, Sakota K (2015) A Comparison of SPOT and J-CAT Based on Test Analysis [Tesuto bunseki ni motozuku ‘SPOT’ to ‘J-CAT’ no hikaku]. Research on the Acquisition of Second Language Japanese [Dainigengo to shite no nihongo no shūtoku kenkyū] (18) 53–69

Li W, Yan J (2021) Probability distribution of dependency distance based on a Treebank of Japanese EFL Learners’ Interlanguage. J. Quant. Linguist. 28(2):172–186.

Linacre JM (2002) Optimizing rating scale category effectiveness. J. Appl. Meas. 3(1):85–106

Linacre JM (1994) Constructing measurement with a Many-Facet Rasch Model. In Wilson M (ed) Objective measurement: Theory into practice, Volume 2 (pp. 129–144). Norwood, NJ: Ablex

Liu H (2008) Dependency distance as a metric of language comprehension difficulty. J. Cognitive Sci. 9:159–191

Liu H, Xu C, Liang J (2017) Dependency distance: A new perspective on syntactic patterns in natural languages. Phys. Life Rev. 21.

Loukina A, Madnani N, Cahill A, et al. (2020) Using PRMSE to evaluate automated scoring systems in the presence of label noise. Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Seattle, WA, USA → Online, 10 July, pp. 18–29. Association for Computational Linguistics

Lu X (2010) Automatic analysis of syntactic complexity in second language writing. Int. J. Corpus Linguist. 15:474–496

Lu X (2012) The relationship of lexical richness to the quality of ESL learners’ oral narratives. Mod. Lang. J. 96:190–208

Lu X (2017) Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Lang. Test. 34:493–511

Lu X, Hu R (2022) Sense-aware lexical sophistication indices and their relationship to second language writing quality. Behav. Res. Method. 54:1444–1460.

Ministry of Health, Labor, and Welfare of Japan (2022) Retrieved from:

Mizumoto A, Eguchi M (2023) Exploring the potential of using an AI language model for automated essay scoring. Res. Methods Appl. Linguist. 3:100050

Okgetheng B, Takeuchi K (2024) Estimating Japanese Essay Grading Scores with Large Language Models. Proceedings of 30th Annual Conference of the Language Processing Society in Japan, March 2024

Ortega L (2015) Second language learning explained? SLA across 10 contemporary theories. In VanPatten B, Williams J (ed) Theories in Second Language Acquisition: An Introduction

Rae JW, Borgeaud S, Cai T, et al. (2021) Scaling Language Models: Methods, Analysis & Insights from Training Gopher. ArXiv, abs/2112.11446

Read J (2000) Assessing vocabulary. Cambridge University Press.

Rudner LM, Liang T (2002) Automated Essay Scoring Using Bayes’ Theorem. J. Technol., Learning and Assessment, 1 (2)

Sakoda K, Hosoi Y (2020) Accuracy and complexity of Japanese Language usage by SLA learners in different learning environments based on the analysis of I-JAS, a learners’ corpus of Japanese as L2. Math. Linguist. 32(7):403–418.

Suzuki N (1999) Summary of survey results regarding comprehensive essay questions. Final report of “Joint Research on Comprehensive Examinations for the Aim of Evaluating Applicability to Each Specialized Field of Universities” for 1996-2000 [shōronbun sōgō mondai ni kansuru chōsa kekka no gaiyō. Heisei 8 - Heisei 12-nendo daigaku no kaku senmon bun’ya e no tekisei no hyōka o mokuteki to suru sōgō shiken no arikata ni kansuru kyōdō kenkyū’ saishū hōkoku-sho]. University Entrance Examination Section Center Research and Development Department [Daigaku nyūshi sentā kenkyū kaihatsubu], 21–32

Taghipour K, Ng HT (2016) A neural approach to automated essay scoring. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, 1–5 November, pp. 1882–1891. Association for Computational Linguistics

Takeuchi K, Ohno M, Motojin K, Taguchi M, Inada Y, Iizuka M, Abo T, Ueda H (2021) Development of essay scoring methods based on reference texts with construction of research-available Japanese essay data. IPSJ J 62(9):1586–1604

Ure J (1971) Lexical density: A computational technique and some findings. In Coultard M (ed) Talking about Text. English Language Research, University of Birmingham, Birmingham, England

Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need. In Advances in Neural Information Processing Systems, Long Beach, CA, 4–7 December, pp. 5998–6008, Curran Associates, Inc., Red Hook, NY

Watanabe H, Taira Y, Inoue Y (1988) Analysis of essay evaluation data [Shōronbun hyōka dēta no kaiseki]. Bulletin of the Faculty of Education, University of Tokyo [Tōkyōdaigaku kyōiku gakubu kiyō], Vol. 28, 143–164

Yao S, Yu D, Zhao J, et al. (2023) Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36

Zenker F, Kyle K (2021) Investigating minimum text lengths for lexical diversity indices. Assess. Writ. 47:100505.

Zhang Y, Warstadt A, Li X, et al. (2021) When do you need billions of words of pretraining data? Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, pp. 1112-1125. Association for Computational Linguistics.

This research was funded by the National Foundation of Social Sciences (22BYY186), awarded to Wenchao Li.

Author information

Authors and affiliations.

Department of Japanese Studies, Zhejiang University, Hangzhou, China

Department of Linguistics and Applied Linguistics, Zhejiang University, Hangzhou, China

Wenchao Li is in charge of conceptualization, validation, formal analysis, investigation, data curation, visualization and writing the draft. Haitao Liu is in charge of supervision.

Corresponding author

Correspondence to Wenchao Li.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Ethical approval

Ethical approval was not required as the study did not involve human participants.

Informed consent

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental material file #1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit .

About this article

Cite this article.

Li, W., Liu, H. Applying large language models for automated essay scoring for non-native Japanese. Humanit Soc Sci Commun 11, 723 (2024).

Received: 02 February 2024

Accepted: 16 May 2024

Published: 03 June 2024


how many sentences in a paragraph for essay


  1. Paragraphs

    Paragraphs are the building blocks of papers. Many students define paragraphs in terms of length: a paragraph is a group of at least five sentences, a paragraph is half a page long, etc. In reality, though, the unity and coherence of ideas among sentences is what constitutes a paragraph. A paragraph is defined as "a group of sentences or a ...

  2. 11 Rules for Essay Paragraph Structure (with Examples)

    Learn how to write perfect paragraphs for essays with four to seven sentences each. Find out the key elements, tips and mistakes to avoid for each paragraph.

  3. PDF Strategies for Essay Writing

    The Anatomy of a Body Paragraph; Transitions; Tips for Organizing Your Essay; Counterargument ... answer to that question will be your essay's thesis. You may have many questions as you consider a source or set of sources, but not all of your questions will form the basis of a strong ...

  4. Academic Paragraph Structure

    Learn how to write strong paragraphs for academic papers with this step-by-step guide and examples. Find out how to identify the paragraph's purpose, show its relevance, give evidence, explain or interpret it, and conclude it.

  5. Paragraphing

    Learn how to write paragraphs with one main idea and three to five sentences each. Find out how to balance and organize your paragraphs for different types of papers.

  6. How Many Sentences in a Paragraph

    Learn how to write paragraphs for essays with different lengths and styles. Find out how to start a paragraph with a topic sentence, how to use dialogue, and how to revise paragraphs.

  7. The Beginner's Guide to Writing an Essay

    For a high school essay, this could be just three paragraphs, but for a graduate school essay of 6,000 words, the body could take up 8-10 pages. Paragraph structure. To give your essay a clear structure, it is important to organize it into paragraphs. Each paragraph should be centered around one main point or idea.

  8. How Many Sentences are in a Paragraph?

    Learn how to write paragraphs of different lengths and styles for various types of writing, such as essays, research papers, blog posts, and creative writing. Find out the definition, rules, and tips for paragraph structure and format.

  9. How Many Sentences Are in a Paragraph?

    Learn how many sentences are in a paragraph for essay and other forms of writing. Find out how paragraph breaks are used for different purposes in fiction and nonfiction.

  10. Writing academically: Paragraph structure

    The topic sentence (Point) This should appear early in the paragraph and is often, but not always, the first sentence. It should clearly state the main point that you are making in the paragraph. When you are planning essays, writing down a list of your topic sentences is an excellent way to check that your argument flows well from one point to the next.

  11. How Many Sentences Are in a Paragraph?

    Figuring out how many sentences are in a paragraph can be a stressful process, especially when you consider the answer can vary. Learn what you need to know for your writing here. ... If you're working with the classic five-paragraph essay, you can aim for the typical three to five sentences per paragraph. For other papers, ...

  12. Paragraph Components, Length & Examples

    The ideal length of a paragraph is 100-200 words with a maximum of five sentences. This allows the readers to get a good grasp of what the topic is about. An ideal paragraph consists of an ...

  13. How Many Paragraphs Should an Essay Have?

    Learn how to write paragraphs for essays with this guide. Find out how many paragraphs you need, when to start a new one, and how to use transitional words and phrases.

  14. How Many Sentences in a Paragraph?

    I especially like Austin Chadd's 2-sentence paragraph, telling us that every paragraph should be at least 4-5 sentences long. And TFP's, right at the top, that uses a 1-sentence paragraph to propose that every paragraph needs 2-3 sentences. A paragraph should have as many words and sentences as it takes to express its concept or idea.

  15. On Paragraphs

    Learn how to write effective paragraphs with one idea, unity, coherence, topic sentence, and adequate development. Find out when to start a new paragraph and how to use transitions and signposts.

  16. How Long is an Essay? Guidelines for Different Types of Essay

    Learn how long an essay should be depending on your academic level and subject. Find out how to use length as a guide to topic and complexity, and how to avoid going under or over the suggested word count.

  17. How to Write an Introduction Paragraph in 3 Steps

    Intro Paragraph Part 3: The Thesis. The final key part of how to write an intro paragraph is the thesis statement. The thesis statement is the backbone of your introduction: it conveys your argument or point of view on your topic in a clear, concise, and compelling way. The thesis is usually the last sentence of your intro paragraph.

  18. How to Craft a Stellar 5-Paragraph Essay: A Step-by-Step Guide

    5-Paragraph Essay FAQs How many words should a 5-paragraph essay be? The length of a 5-paragraph essay can vary depending on the purpose and complexity of the topic, as well as the intended audience. However, a typical 5-paragraph essay ranges from 250 to 500 words. Here's a breakdown: Introduction: 50-100 words. This includes a brief ...

  19. Applying large language models for automated essay scoring for non

    These include the Intelligent Essay Assessor (Landauer et al. 2003), the e-rater engine by Educational Testing Service (Attali and Burstein, 2006; Burstein, 2003), MyAccess with the IntelliMetric ...

  20. How to Write an Essay Introduction

    Step 1: Hook your reader. Step 2: Give background information. Step 3: Present your thesis statement. Step 4: Map your essay's structure. Step 5: Check and revise. More examples of essay introductions. Other interesting articles. Frequently asked questions about the essay introduction.