Yes, Java helped me name Lucas

posted by Roberto Cortez

Having a child is one of the most memorable moments you can experience. Parents have a huge responsibility for this new life, and it starts right away when they have to choose the name. Our name is something that we keep for life (most of the time), so it needs careful consideration. With literally thousands of names to choose from, how about using some Java technology to help us?

Yes, I wrote a small Java application to help me choose my baby's name!

Requirements

First of all, we have to define some rules!

Basic Rules

  • A short name (but not too short)
  • In the first half of the alphabet
  • Without special characters

A short name, so it is easy to call him. In the first half of the alphabet, because in Portugal we have this stupid rule where kids are seated in the classroom in alphabetical order, so names starting with letters at the end of the alphabet end up at the back of the room! And finally, a name without special characters (we have a few in Portugal), so it is easier for foreigners to use.

Advanced Rules

  • Exists in at least three languages (Portuguese, Spanish and English)
  • Sounds and is written the same in all three languages

This covers the fact that I was born in a Spanish-speaking country and that we live in Portugal, with English added for globalization.

Constraints

There are also some constraints. In Portugal, you cannot use any name you want. You need to pick the name from an approved list of names. Of course, this is a comprehensive list that covers all common names. It is mostly used to avoid giving a stupid name to your kid. If you're curious about it, check the list here.

Implementation

Basic Rules are Easy

After loading the names from the file, the basic rules reduce to a few Java 8 Streams filters.

(Yes, I’m lazy with regular expressions)
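A minimal sketch of how the three basic rules could be expressed as a Java 8 Streams filter. The class name `BasicNameRules`, the file name `names.txt` (one name per line), the length bounds and the cutoff letter `M` are all assumptions made for illustration; they are not the values from the original application.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class BasicNameRules {

    // Illustrative thresholds (assumptions): the post does not state exact values.
    static final int MIN_LENGTH = 4;
    static final int MAX_LENGTH = 5;
    static final char LAST_ALLOWED_INITIAL = 'M';

    // A short name, but not too short.
    static boolean isShort(String name) {
        return name.length() >= MIN_LENGTH && name.length() <= MAX_LENGTH;
    }

    // The initial letter falls in the first half of the alphabet (A to M).
    static boolean isInFirstHalf(String name) {
        if (name.isEmpty()) {
            return false;
        }
        char initial = Character.toUpperCase(name.charAt(0));
        return initial >= 'A' && initial <= LAST_ALLOWED_INITIAL;
    }

    // Only unaccented ASCII letters, checked char by char rather than with a regex.
    static boolean hasNoSpecialCharacters(String name) {
        return name.chars().allMatch(c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'));
    }

    // Applies the three basic rules in sequence and collects the survivors.
    static List<String> applyBasicRules(Stream<String> names) {
        return names.map(String::trim)
                .filter(name -> !name.isEmpty())
                .filter(BasicNameRules::isShort)
                .filter(BasicNameRules::isInFirstHalf)
                .filter(BasicNameRules::hasNoSpecialCharacters)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // "names.txt" is a hypothetical file with one approved name per line.
        Path file = Paths.get(args.length > 0 ? args[0] : "names.txt");
        try (Stream<String> lines = Files.lines(file)) {
            applyBasicRules(lines).forEach(System.out::println);
        }
    }
}
```

Keeping each rule in its own small method makes it easy to tweak one threshold and rerun the filter over the whole list.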

Advanced Rules are more Interesting

Beider-Morse Phonetic Matching

When searching for a way to determine if a word or a name exists in another language and sounds the same, I came across Beider-Morse Phonetic Matching.

The main objective of Beider-Morse Phonetic Matching is to recognize that two words written differently can actually be phonetically equivalent, that is, they can sound alike. But unlike Soundex methods, the “sounds-alike” test is based not only on the spelling but also on linguistic properties of various languages.

How?
It tries to guess the language of the word following specific language rules and then calculates a phonetic value for that word. For instance:

  • tsch, final mann and witz are specifically German
  • final and initial cs and zs are necessarily Hungarian
  • cz, cy, initial rz and wl, final cki, letters ś, ł and ż can be only Polish

And then a phonetic value is calculated:

Original   Phonetic Value   Example
tz         ts               Fitzgerald
c          s                circle
cc         ks               success, accent
gh         (g|f|w)          burgh|tough|bough
kn         n                knight
mc         mak              McDonald

The Commons Codec project has an implementation of the Beider-Morse Phonetic Matching algorithm. Try it out and play with it.
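To make the idea of a phonetic value more concrete, here is a toy sketch that applies a few of the single-valued substitutions from the table earlier in this section (tz, cc, kn, mc) and compares the results. It is not the Beider-Morse algorithm: the real rules are context-sensitive and language-aware, and rows such as c and gh are deliberately left out. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class ToyPhoneticRules {

    // A few single-valued rows from the table; context-sensitive rows
    // such as "c" and "gh" are left out of this toy.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("tz", "ts");
        RULES.put("cc", "ks");
        RULES.put("kn", "n");
        RULES.put("mc", "mak");
    }

    // Scans the word left to right and replaces every matching letter group.
    static String toyPhoneticValue(String word) {
        String input = word.toLowerCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            boolean matched = false;
            for (Map.Entry<String, String> rule : RULES.entrySet()) {
                if (input.startsWith(rule.getKey(), i)) {
                    out.append(rule.getValue());
                    i += rule.getKey().length();
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // Two spellings "sound alike" in this toy when their values are equal.
    static boolean soundAlike(String a, String b) {
        return toyPhoneticValue(a).equals(toyPhoneticValue(b));
    }

    public static void main(String[] args) {
        System.out.println(toyPhoneticValue("Fitzgerald"));  // prints "fitsgerald"
        System.out.println(toyPhoneticValue("McDonald"));    // prints "makdonald"
        System.out.println(soundAlike("Knight", "Night"));   // prints "true"
    }
}
```

For real name matching, the Commons Codec implementation is the one to use; this sketch only shows the shape of the idea, which is to map spellings to a phonetic value and then compare the values.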

Rosette API

The Rosette API is a text analysis toolkit that provides multiple services to perform text analysis. It also has a Name Translation service with a REST endpoint that takes a name and a target language and returns the translation with a confidence score. The API is useful to double check the results obtained with Beider-Morse Phonetic Matching.

They have fantastic support, providing libraries to integrate with their APIs in multiple languages and also a lot of samples you can use. Check their GitHub repo here.

behindthename.com

The Behind the Name website provides the etymology and history of first names, plus a comprehensive list of names and the languages in which they exist. On top of that, they also provide an API to check that information, so you can use it to triple check the results from Beider-Morse Phonetic Matching and the Rosette API.

Lucas Behind the Name

Adding it all Together

Let's just add these checks to our previous Java 8 Streams filter.
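A minimal sketch of what the combined filter could look like, with the advanced rule plugged in as one more predicate. The phonetic encoder is passed in as a function of (name, language) so that the sketch stays self-contained; in a real application it could be backed by the Commons Codec Beider-Morse implementation, and the Rosette and Behind the Name checks could be chained as further predicates. All class and method names here are hypothetical, and the demo encoder is a stand-in that only exists to make the example runnable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.BiFunction;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class AdvancedNameRules {

    // The three languages from the advanced rules.
    static final List<String> LANGUAGES = Arrays.asList("portuguese", "spanish", "english");

    // True when the encoder yields the same phonetic value for every language.
    static Predicate<String> soundsTheSameEverywhere(BiFunction<String, String, String> encoder) {
        return name -> LANGUAGES.stream()
                .map(language -> encoder.apply(name, language))
                .distinct()
                .count() == 1;
    }

    // Basic rules first (cheap), advanced rules second (potentially expensive).
    static List<String> applyRules(List<String> names,
                                   Predicate<String> basicRules,
                                   Predicate<String> advancedRules) {
        return names.stream()
                .filter(basicRules)
                .filter(advancedRules)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-in encoder for the demo only: it pretends that "j" sounds
        // different in Spanish and that everything else is identical.
        BiFunction<String, String, String> demoEncoder = (name, language) -> {
            String lower = name.toLowerCase(Locale.ROOT);
            return "spanish".equals(language) ? lower.replace('j', 'h') : lower;
        };

        List<String> candidates = Arrays.asList("Lucas", "David", "Jorge", "Bartolomeu");
        List<String> result = applyRules(
                candidates,
                name -> name.length() <= 5,
                soundsTheSameEverywhere(demoEncoder));

        System.out.println(result); // prints "[Lucas, David]"
    }
}
```

Running the cheap basic rules before the phonetic check keeps the number of expensive lookups small, which matters once the advanced rules call out to external services.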

Final Results

After all the rules were applied to the initial list, only two names remained: the obvious Lucas, and David. Both names exist, are written and are pronounced the same way in Portuguese, English and Spanish.

Proof it works?

Well, now I just go to any random Starbucks, order something under the name Lucas and confirm that they got it right. So far so good!

Lucas Proof

If you find this interesting, I even published the code in a GitHub repo. Check it out.

Note for Lucas: Lucas, if you read this when you're older, please excuse me for having such a geeky father.

Comments ( 12 )

  1. Ashey

    Loved it

    Do you have any doubt he is going on the geek side as well? I don’t and I kind of like it

    Signed: Lucas's mom.

  2. haraujo

    My dear friend, this was your best post ever. Great story!

    Cheers
    Hugo

  3. fbatista

    Very good, although it scares me a bit that only 2 names made it to the final… That left your choice rather limited!

    BTW, the English have had no problems with Fabio so far, and the F always put me in the “front rows”. Maybe it was eliminated because of the accent?

    • Roberto Cortez

      Being limited actually worked out well. What mattered was that the mother liked the name. Otherwise, none of this would have been of any use 😛

      But yes, Fábio was eliminated because of the accent. Sorry 🙁

  4. Eddie

    Robertinho, I think we need to go out for drinks more often!!!
    GJ!

    • Roberto Cortez

      Ooh… I no longer have time for that. Now I have a very punctual alarm clock at 8 in the morning that never fails 😀

  5. Javier

    Very interesting. We arrived at nearly the same conclusion about what to name our son as you did for yours. We were thinking about two important rules for his name:
    – Short name.
    – Easy to pronounce in as many languages as possible.

    Without going through this coding process, and only using brainstorming, we arrived at two possible names: Eric and Luca.

    I’m Spanish and living in Spain, but I have spent some time in Italy, so we decided to pick Luca (the Italian form of Lucas). It is easy to pronounce correctly in all the languages that I know, and it is written very similarly in a lot of languages.

    The downside is that here in Spain, when people hear his name, they tend to add a final “s”. So many times we need to specify that his name is without the final “s”.

    • Roberto Cortez

      Hi Javier,

      Thank you for your comment. You got a cool story too 🙂

      Did you check https://www.behindthename.com/name/luca-1? 🙂

      • Javier

        Hi Roberto,

        Before I read your post, I did not know that this site existed. But when I read your post, I took a glance at it and found it very interesting. Without any doubt, it confirms to me that it was a good name decision.
