How do we learn a new language?  

We often say that children learn languages much faster than adults simply because young minds are more adaptable. That is true. However, another important factor is that children have environmental advantages when learning languages that most adults don’t have.

Initial Observation

Adults struggle to learn English. Compared with children, they learn less efficiently and with more frustration.


By simulating an immersive English learning environment in VR, we believe adults will be able to learn English more efficiently and playfully.




1 Prototype Designer (Me)

1 Developer

Time Frame

3 Weeks

My Role

User Research

Asset Creation

Unity Prototyping




Media Screen

To better understand how the learning environments of children and adults differ, I reviewed some current research studies. Here are the key findings that helped me understand our target users.

What environmental advantages do children have over adults in language learning?

#1 Immersion

  • Children have environmental advantages when learning a language that most adults don’t have. They learn by being immersed in multilingual environments.

#2 Responsive Environment

  • Children learn through their responsive environment, i.e., by passively “absorbing” the language through context rather than through verb conjugation drills and exams.

#3 Fewer Inhibitions

  • Children have fewer inhibitions. It’s much easier to learn a language if you’re comfortable making mistakes and sounding foolish, a hurdle that makes most adults extremely anxious.


Next, I identified the pain points and the current solutions to them. I talked to four people about the difficulties they faced in learning English and asked them to list the methods they had found effective.

Why is learning English a tough task for non-native speakers?

I also collected the methods that users found useful on traditional learning platforms and ranked them by how often they were mentioned. Based on that, I asked another five people to rank those solutions again.

Survey: Rank the efficiency of these methods.


I then manually sorted them into three categories of English learning: speaking, listening, and reading. (Interestingly, none of these methods relates to “writing”, which is also an essential part of learning English. I decided to set it aside for now, because keyboard input in VR is still being worked out and is not appropriate for all environments.)


Prior Work in VR

To get a better sense of which interactions are helpful and how much is feasible in VR, we then tried some VR products for language learning. Specifically, we looked at Mondly VR and ImmerseMe.

Mondly VR





What interaction methods are helpful and feasible in VR?

Both products use contextual dialogue as the key method for teaching language in VR scenarios.


When creating a VR experience, the most important thing to give the player is a sense of being there. They need to feel as if they really are in another world. After trying these two products, my teammate and I agreed that they do create an immersive environment in VR, but two issues stood out:

#1 It still does not feel real

  • The avatar’s feedback is not natural

    • lack of body movement

    • lack of emotion (facial expression, audio, etc.)

  • In a task-based game, feedback of “right” and “wrong” is delivered by the user interface rather than by the avatar

#2 Reading is not addressed

  • Both products focus only on speaking and listening

Design Decision

Inspired by the insights above, we decided to explore a better and more comprehensive immersive experience in VR by improving the contextual dialogue and introducing “flash cards” to fill the “reading” gap.


We then drafted a storyboard that incorporated the learning methods and techniques we wanted to use. The scenario starts when the player opens his eyes and finds himself in a cradle. He then realizes that he is embodied in an infant’s body and that a girl is standing in front of the cradle, trying to teach him the word “bear”. After finishing this task, the user is free to explore the room. While he explores, the room enters vocabulary mode and every object is tagged with its name. The user can learn by wandering around and can then enter a small game to test whether he remembers those words.


Task Flow Design

I created the task flow to make our design clear. Given the limited time, we decided to build only the three tasks marked with hatching to validate our design.


Asset Creation

Scene Design

I made ample use of the Unity Asset Store for basic geometry, reassigned the textures and materials, and adjusted the lighting to suit our theme.


Character Design

To deliver a fast prototype, I took a basic model from Sketchfab and cleaned it up in Blender.


Body Movement

I used Mixamo to animate the avatar.



Point the bear







Unity Prototyping

Task #1: What’s a bear?

The user learns what “bear” means and how to pronounce it by interacting with the avatar. Mike coded this part.



The demo shows how the avatar teaches the user what a bear is through human-like cues, including gesture, body movement, and expressive audio.


Task #2: Vocabulary Learning Mode

Demo 1

Users learn words by wandering around the room and reading the labels.


I used a VRTK prefab to label the objects and gave each label a color similar to its object to make the connection stronger.

Prototyping with VRTK prefab





Demo 2

To help build a stronger connection between a word and its meaning, I used a flash card to display further information, including spelling, an introduction, and images or videos.


Task #3: Vocabulary Game Mode

I coded this game and integrated it with the Google VR SDK. The user needs to hit the word they hear. If they hit the right word, the score increments; otherwise, the wrong word jumps, and the flash card for the right word (I only built the “bear” card for this demo) appears to give a further explanation.
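The rule described above can be sketched outside Unity in a few lines of Python. This is only an illustration of the scoring logic under stated assumptions, not the code used in the prototype: the class names (`FlashCard`, `HitResult`, `VocabularyGame`), their fields, and the example word “chair” are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FlashCard:
    """Extra information shown for a word (fields are hypothetical)."""
    word: str
    introduction: str = ""


@dataclass
class HitResult:
    """What the scene should do after the user hits a word."""
    correct: bool
    score: int
    jumping_word: Optional[str]       # word that should play the "jump" animation
    flash_card: Optional[FlashCard]   # card to show, if one exists for the right word


@dataclass
class VocabularyGame:
    flash_cards: Dict[str, FlashCard] = field(default_factory=dict)
    score: int = 0

    def hit(self, heard_word: str, hit_word: str) -> HitResult:
        """Apply the rule: a right hit scores; a wrong hit jumps and explains."""
        if hit_word == heard_word:
            self.score += 1
            return HitResult(True, self.score, None, None)
        # Wrong hit: the hit word jumps, and the right word's card (if any) is shown.
        card = self.flash_cards.get(heard_word)
        return HitResult(False, self.score, hit_word, card)


if __name__ == "__main__":
    # Only a "bear" card exists, mirroring the demo; "chair" is a made-up wrong word.
    game = VocabularyGame(flash_cards={"bear": FlashCard(word="bear")})
    print(game.hit(heard_word="bear", hit_word="chair"))
    print(game.hit(heard_word="bear", hit_word="bear"))
```

Returning a result object instead of triggering animations directly keeps the rule separate from the rendering, so the same logic could drive any front end.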



Once the prototypes were done, it was time to see how they performed with people. I asked four friends for feedback, focusing on the following four aspects:


  • Can you understand what the avatar is conveying?


  • How much did you want to play with it?


  • Did you believe you were embodied in a foreign baby's body?


  • How comfortable did you feel speaking and interacting with the avatar?

And here are some highlights:

  • Testers said contextual learning is very helpful for learning words and dialogue

  • Surprisingly, the avatar’s animation attracted the most attention, and almost everyone tried different gestures to trigger a reaction from the avatar

  • It is more comfortable to speak in VR because “no one is judging you here”

  • They liked the simplicity of the flash card and found it really easy to use.