Developping for the translation industry



 Tuesday, 06 March 2012

Question Mark ?
Origin: When early scholars wrote in Latin, they would place the word quaestio – meaning “question” – at the end of a sentence to indicate a query. To conserve valuable space, it was soon shortened to qo, which caused another problem – readers might mistake it for the ending of a word. So they squashed the letters into a symbol: a lowercase q on top of an o. Over time the o shrank to a dot and the q to a squiggle, giving us our current question mark.

Exclamation Point !
Origin: Like the question mark, the exclamation point was invented by stacking letters. The mark comes from the Latin word io, meaning “exclamation of joy.” Written vertically, with the i above the o, it forms the exclamation point we use today.

Equal Sign =
Origin: Invented by Welsh mathematician Robert Recorde in 1557, with this rationale: “I will sette as I doe often in woorke use, a paire of paralleles, or Gemowe [i.e., twin] lines of one length, thus: =, bicause noe 2 thynges, can be more equalle.” His equal signs were about five times as long as the current ones, and it took more than a century for his sign to be accepted over its rival: a strange curly symbol invented by Descartes.

Ampersand &
Origin: This symbol is stylized et, Latin for “and.” Although it was invented by the Roman scribe Marcus Tullius Tiro in the first century B.C., it didn’t get its strange name until centuries later. In the early 1800s, schoolchildren learned this symbol as the 27th letter of the alphabet: X, Y, Z, &. But the symbol had no name. So, they ended their ABCs with “and, per se, and” meaning “&, which means ‘and.’” This phrase was slurred into one garbled word that eventually caught on with everyone: ampersand.

Octothorpe #
Origin: The odd name for this ancient sign for numbering derives from thorpe, the Old Norse word for a village or farm that is often seen in British placenames. The symbol was originally used in mapmaking, representing a village surrounded by eight fields, so it was named the octothorpe.

Dollar sign $
Origin: When the U.S. government began issuing its own money in 1794, it used the common world currency – the peso – also called the Spanish dollar. The first American silver dollars were identical to Spanish pesos in weight and value, so they took the same written abbreviation: Ps. That evolved into a P with an s written right on top of it, and when people began to omit the circular part of the p, the sign simply became an S with a vertical line through it.

This comes from a book named “Uncle John’s Supremely Satisfying Bathroom Reader” http://www.neatorama.com/2007/07/09/the-origin-of-everyday-punctuation-marks/

 

Other posts:

Some funny cross cultural marketing and translation mistakes part 1 and part 2

.NET String Format Syntax Cheat Sheet

Google Translator has been Hacked

Information on The New Canadian Translation Standard

Tuesday, 06 March 2012 08:53:13 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
General
 Monday, 11 July 2011

See the first post in this series here

Although cruel, cross cultural marketing mistakes are a humorous means of understanding the impact poor cultural awareness or translations can have on a product or company when selling abroad.

Enjoy!

1. When Kentucky Fried Chicken entered the Chinese market, to their horror they discovered that their slogan "finger lickin' good" came out as "eat your fingers off".

2. Chinese translation also proved difficult for Coke, which took two tries to get it right. They first tried Ke-kou-ke-la because when pronounced it sounded roughly like Coca-Cola. It wasn't until after thousands of signs had been printed that they discovered that the phrase means "bite the wax tadpole" or "female horse stuffed with wax", depending on the dialect. Second time around things worked out much better. After researching 40,000 Chinese characters, Coke came up with "ko-kou-ko-le" which translates roughly to the much more appropriate "happiness in the mouth".

3.  Things weren't much easier for Coke's arch-rival Pepsi. When they entered the Chinese market a few years ago, the translation of their slogan "Pepsi Brings you Back to Life" was a little more literal than they intended. In Chinese, the slogan meant, "Pepsi Brings Your Ancestors Back from the Grave".

4. General Motors had a perplexing problem when they introduced the Chevy Nova in South America. Despite their best efforts, they weren't selling many cars. They finally realized that in Spanish, "nova" means "it won't go". Sales improved dramatically after the car was renamed the "Caribe."

5. Things weren't any better for Ford when they introduced the Pinto in Brazil. After watching sales go nowhere, the company learned that "Pinto" is Brazilian slang for "tiny male genitals." Ford pried the nameplates off all of the cars and substituted them with "Corcel," which means horse.

6. Sometimes it's one word of a slogan that changes the whole meaning. When Parker Pen marketed a ballpoint pen in Mexico, its ads were supposed to say "It won't leak in your pocket and embarrass you." However, the company mistakenly thought the Spanish word "embarazar" meant embarrass. Instead the ads said "It won't leak in your pocket and make you pregnant."

7. Coors put its slogan, "Turn It Loose," into Spanish, where it was read as "Suffer From Diarrhea."

8. Scandinavian vacuum manufacturer Electrolux used the following in an American campaign: "Nothing sucks like an Electrolux"

9. The Dairy Association's huge success with the campaign "Got Milk?" prompted them to expand advertising to Mexico. It was soon brought to their attention the Spanish translation read "Are you lactating?"

10. American Motors tried to market its car, the “Matador,” in Puerto Rico based on an image of strength and courage. However, in Puerto Rico the word, literally translated, means “killer.” The car’s lack of popularity there has been linked to the name: with many hazardous roads in the country, consumers associated it with death.

 

Other posts

Some funny cross cultural marketing and translation mistakes

What if stop signs were invented by a major corporation

Chuck Norris Programming facts

When CAPTCHA goes bad

Cheeseburgery hamburgers...

Monday, 11 July 2011 10:38:38 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Humor | Language Industry

Sometimes we need to create a CSV list of items, containing in a single row the different values of a column in a table.

This script accomplishes just that, in the simplest manner possible:

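-- Accumulate the values of MyColumn into one comma-separated string.
DECLARE @theListOfValues VARCHAR(MAX)
SELECT @theListOfValues = COALESCE(@theListOfValues + ',', '') + MyColumn
FROM MyTable
WHERE MyColumn IS NOT NULL -- a NULL value would reset the list built so far
SELECT @theListOfValues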

 

Other posts:

How to remove multiple whitespaces from a string with SQL Server 2005

Which performs better: ISNULL or COALESCE

How to remove leading zeros from the results of an SQL Query

Simple way to count characters and words using T-SQL

Domain registration and one full year of Web hosting for Free!

Monday, 11 July 2011 10:19:50 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
SQL
 Wednesday, 29 September 2010

This morning Microsoft released a security update that addresses the ASP.NET Security Vulnerability that I’ve blogged about this past week. 

From Scott Guthrie’s blog post we learn that the update should not require any code or configuration change to your existing ASP.NET applications. Also, if you apply the update to a live web-server, there will be some period of time when the web-server will be offline (although an OS reboot should not be required). You’ll want to schedule and coordinate your updates appropriately.

If you want to apply the update right now, you can download it from the Microsoft Download Center. The update will also be released in the next scheduled Windows Update and Windows Server Update.

Other posts

19 great tips to enhance your SQL Server security

Intro to LDAP injection

The LoginProperty function in SQL Server 2005

Google Translator Hacked

SQL Injection joke

Wednesday, 29 September 2010 10:41:59 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
.NET | Security
 Wednesday, 22 September 2010


I don’t understand why but it seems that I can never remember the .NET string format syntax.

Then I found this: a very nice two-page cheat sheet containing all you need to know about the string format syntax.

Download it here.

 

Other posts:

How to: Use Active Directory to authenticate users

Sorting strings for real people - A human-friendly IComparer

How to set NTFS permissions using C#

How to restart a Windows service using C#

How to get the list of object modifications in SQL Server

Wednesday, 22 September 2010 09:44:16 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
.NET
 Wednesday, 15 September 2010

Seen on Visual Studio Magazine

Two security researchers, Thai Duong and Juliano Rizzo, have discovered a bug in the default encryption mechanism used to protect the cookies normally used to implement Forms Authentication in ASP.NET. Using their tool (the Padding Oracle Exploit Tool or POET), they can repeatedly modify an ASP.NET Forms Authentication cookie encrypted using AES and, by examining the errors returned, determine the Machine Key used to encrypt the cookie. The process is claimed to be 100 percent reliable and takes between 30 and 50 minutes for any site.

Once the Machine Key is determined, attackers can create bogus forms authentication cookies. If site designers have chosen the option to embed role information in the security cookie, then attackers could arbitrarily assign themselves to administrator roles. This exposure also affects other membership provider features, spoofing protection on the ViewState, and encrypted information that might be stored in cookies or otherwise be made available at the client.

While the exposure is both wide and immediate, the fix is simple. The hack exploits a bug in .NET's implementation of AES encryption. The solution is to switch to one of the other encryption mechanisms -- to 3DES, for instance. Since encryption for the membership and roles providers is handled by ASP.NET, no modification of existing code should be required for Forms Authentication.

The encryption method can be set in the web.config file for a site, in IIS 7 for a Web server, or in the config file for .NET on a server in %SYSTEMROOT%\Microsoft.NET\Framework\version\CONFIG\. On 64-bit systems, it must also be set in %SYSTEMROOT%\Microsoft.NET\Framework64\version\CONFIG\. A typical entry would look like this:

    <machineKey validationKey="AutoGenerate,IsolateApps"
                validation="3DES"
                decryptionKey="AutoGenerate,IsolateApps"
                decryption="3DES" />

On a Web farm, this setting will have to be made on all the servers in the farm.

These settings are also used to prevent spoofing (ViewState data is encoded but not encrypted), so making this change will also switch the ViewState to using 3DES. Developers who are using AES in their code to encrypt information made available at the client should consider modifying their code to use a different encryption mechanism.

 

Other Posts:

Google instant makes searching for God harder

Tabnabbing: A New Kind Of Phishing Attack

Big news in security: 1024-bit RSA encryption cracked!

Tips to enhance your SQL Server security

What is LDAP injection?

Wednesday, 15 September 2010 08:35:36 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
.NET | News | Security
 Monday, 13 September 2010

As seen on techcrunch:


If you think about all the things people search for on Google, “God” has to be pretty high up there, right? I mean, since the dawn of man, people have been searching for the meaning of life and its creator, so what better way to do that than with a search engine? But divinity apparently has nothing on cheap domain names.

When you try to do a search for “God” with the new Google Instant feature, it predicts that you’re going to type in “Godaddy” instead. If you hit a space after the “d”, it thinks you’re looking for “God of War”, the popular videogame. So the only way to actually search for “God” with this new Google Instant feature is to hit the search button.

To make Google Instant work, the search giant looks across all queries to find the most popular ones and then predicts what it thinks you’re going to type and auto-populates the results based on that. Clearly, both “Godaddy” and “God of War” are more popular queries on Google — something that is either humorous or sad depending on your level of religiousness.

Also kind of humorous is that “Godaddy” isn’t really the name of the company, it’s “Go Daddy” with a space (though the domain is of course godaddy.com). Also interesting is that Go Daddy is a heavy Google AdWords user, and so the first result for the “God” query is a sponsored link for Go Daddy.

 

Other posts:

Some funny cross cultural marketing and translation mistakes

Make Your Site Faster with Google Page Speed

Big news in security: 1024-bit RSA encryption cracked

Google Translator Hacked

Georges Perec's palindrome

Monday, 13 September 2010 10:18:44 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
News
 Friday, 30 July 2010

Although cruel, cross cultural marketing mistakes are a humorous means of understanding the impact poor cultural awareness or translations can have on a product or company when selling abroad.

Enjoy,

1. Locum is a Swedish company. As most companies do at Christmas they sent out Christmas cards to customers. In 1991 they decided to give their logo a little holiday spirit by replacing the "o" in Locum with a heart. You can see the result...

ilovecum.jpg

2. The Japanese company Matsushita Electric was promoting a new Japanese PC for internet users. Panasonic created the new web browser and had received a license to use the cartoon character Woody Woodpecker as an interactive internet guide.

The day before the huge marketing campaign, Panasonic realised its error and pulled the plug. Why? The ads for the new product featured the following slogan: "Touch Woody - The Internet Pecker." The company only realised its cross cultural blunder when an embarrassed American explained what "touch Woody's pecker" could be interpreted as!

3. The Swedish furniture giant IKEA somehow agreed upon the name "FARTFULL" for one of its new desks. Enough said.

4. In the late 1970s, Wang, the American computer company, could not understand why its British branches were refusing to use its latest motto "Wang Cares". Of course, to British ears this sounds too close to "Wankers", which would not really give a very positive image to any company.

5. There are several examples of companies getting tangled up with bad translations of products due to the word "mist". We had "Irish Mist" (an alcoholic drink), "Mist Stick" (a curling iron from Clairol) and "Silver Mist" (Rolls Royce car) all flopping as "mist" in German means dung/manure. Fancy a glass of Irish dung?

6. "Traficante", an Italian mineral water, found a great reception in Spain's underworld. In Spanish it translates as "drug dealer".

7. In 2002, Umbro, the UK sports manufacturer, had to withdraw its new trainers (sneakers) called the Zyklon. The firm received complaints from many organisations and individuals as it was the name of the gas used by the Nazi regime to murder millions of Jews in concentration camps.

8. Sharwoods, a UK food manufacturer, spent £6 million on a campaign to launch its new 'Bundh' sauces. It received calls from numerous Punjabi speakers telling them that "bundh" sounded just like the Punjabi word for "arse".

9. Honda introduced their new car "Fitta" into Nordic countries in 2001. If they had taken the time to undertake some cross cultural marketing research they may have discovered that "fitta" was an old word used in vulgar language to refer to a woman's genitals in Swedish, Norwegian and Danish. In the end they renamed it "Honda Jazz".

10. A nice cross cultural example of the fact that all pictures or symbols are not interpreted the same across the world: staff at the African port of Stevadores saw the "internationally recognised" symbol for "fragile" (i.e. broken wine glass) and presumed it was a box of broken glass. Rather than waste space they threw all the boxes into the sea!

 

Other Posts:

What if stop signs were invented by a major corporation

Huge List of Dumb and Crazy Laws in the United States

SQL Injection humor

How to know if your software project is doomed

Chuck Norris Programming facts

Friday, 30 July 2010 12:43:17 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Humor

Genetic algorithms are a form of evolutionary computation, a branch of artificial intelligence that focuses on evolving effective or optimal solutions to difficult problems, based on the biological theory of evolution.

Genetic algorithms are, at their core, a search/optimisation technique. They are a way of finding maximum/minimum solutions to problems and can be effective when no efficient exact algorithm for the problem is known. An example here would be the ‘Traveling Salesman’ problem.

Genetic algorithms work by taking an initial population of potential solutions (referred to as individuals), selecting a subset of the population that has the highest fitness then using that subset to generate a second generation. From the second generation again a subset with the highest fitness is selected and used to generate a third generation. This repeats until either the ‘fittest’ individual is considered a good enough solution, or until a certain number of generations have passed.

There are advantages to using genetic algorithms to solve problems over more traditional methods like hill climbing.

  • Genetic algorithms can quickly produce good solutions though they may take a lot of time to find the best solution. This is a benefit when the problem is such that the absolute best solution is not necessary, just one that is ‘good enough’
  • They are much less susceptible than hill climbing to getting trapped by local maxima.
  • They do not work on the entire search space one potential solution at a time, but rather work on populations of potential solutions, focusing towards more optimal areas of the search space.

A genetic algorithm will almost always find an optimal solution, given enough time. The main downside is that they may take a lot of time to find that optimal solution.

Components of a Genetic Algorithm

There are two main critical parts in setting up a genetic algorithm for a problem.

  • The encoding of the potential solutions into a form where they can be operated on.
  • The fitness function which defines which individuals are better than others, which are closer to the maximum that is being searched for.

Most of the design work when using genetic algorithms goes into those two problems.

Encoding

Encoding is the process of taking all the values that make up a potential solution and turning them into a form that the genetic algorithm can operate on.

The selection of an encoding is of utmost importance to the effectiveness of the entire process, and a poor representation can make the entire problem much harder than it should be. Unfortunately there has been little academic work done on the process of designing representations.

Often for genetic algorithms, the end result of the encoding will be a binary string. There are other variations of evolutionary computation that use other representations, from the arrays of real numbers used by evolutionary strategies to the code trees used by genetic programming.
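As a rough illustration of a binary encoding, here is a minimal Python sketch. It assumes a potential solution is a short list of small non-negative integers, each stored in a fixed number of bits; the function names encode and decode and the 4-bit width are choices made for this sketch only.

```python
def encode(numbers, bits=4):
    """Concatenate the fixed-width binary form of each number."""
    for n in numbers:
        if not 0 <= n < 2 ** bits:
            raise ValueError(f"{n} does not fit in {bits} bits")
    return "".join(format(n, f"0{bits}b") for n in numbers)


def decode(chromosome, bits=4):
    """Split the bit string back into the numbers it represents."""
    return [int(chromosome[i:i + bits], 2)
            for i in range(0, len(chromosome), bits)]


print(encode([11, 7, 3, 15]))      # 1011011100111111
print(decode("1011011100111111"))  # [11, 7, 3, 15]
```

A fixed width per value keeps every encoded string the same length, which is what lets operators work on bit positions without knowing what the bits mean.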

Fitness function

Depending on the problem, the fitness function can be trivial to write or near-impossible. The design of the fitness function is completely based on the problem that is being solved.

There are two important considerations for a fitness function.

  • It must be deterministic.
  • It must be fast

If the fitness of an individual is assessed twice, it must come to the same value (1). If the fitness function could return different values for the same individual, then it is of no use in determining the fittest individuals in the population and hence the genetic algorithm will not be able to identify the best solution to the problem.

The fitness of each individual is assessed at least once in each generation. The calculation of the fitness function is usually the most time consuming part of the entire process, and the longer the fitness function takes to run, the longer the entire process takes.

(1) There are cases where the fitness of an individual may depend on external factors which change over time. Hence a fitness function may give different values for one individual if calculated at different times. Genetic algorithms in a changing environment are a little beyond the scope of this entry.

Evolution Process

In order to create a new generation, the fittest individuals from the previous generation are taken and used to generate the next generation. There are two main operators that are used to generate a generation from the previous one: crossover and mutation.

Crossover

Crossover involves taking two individuals, splitting each one’s encoded string and swapping parts to generate two new individuals.

Say we had two individuals with the following encoded strings (spaces added for clarity)
0000 0001 1111 1110
0101 1010 1100 0011
and we chose the splitting point for the crossover after the 4th bit, the resulting strings after the crossover will be
0000 1010 1100 0011
0101 0001 1111 1110

In genetic algorithms crossover is the primary operator used. What I described here was a single-point crossover. There are a number of other variations that can be used.
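The crossover described above can be sketched in a few lines of Python. The function name single_point_crossover is made up for this sketch, and individuals are assumed to be equal-length strings of '0' and '1' characters.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length bit strings after `point` bits."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < point < len(parent_a):
        raise ValueError("crossover point must fall inside the string")
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


# The two individuals from the example above, split after the 4th bit.
print(single_point_crossover("0000000111111110", "0101101011000011", 4))
# ('0000101011000011', '0101000111111110')
```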

Mutation

Mutation is an operator applied to a single individual. It’s usually applied after crossover has generated new individuals. Mutation involves flipping a single bit somewhere in the encoded string.

Let’s take the two individuals that were generated by the crossover earlier and apply a random mutation to each:
0000 1010 1100 0011
0101 0001 1111 1110
After mutation:
0000 1010 1101 0011
0101 0001 1011 1110

In genetic algorithms mutation is very seldom applied and only a small percentage of individuals in a generation will be affected by the mutation operator.
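A single-bit mutation might look like the following Python sketch. The helper names flip_bit and mutate are assumptions of this sketch, and positions are counted from 0, so index 11 is the 12th bit.

```python
import random


def flip_bit(chromosome, index):
    """Return a copy of the bit string with the bit at `index` inverted."""
    flipped = "1" if chromosome[index] == "0" else "0"
    return chromosome[:index] + flipped + chromosome[index + 1:]


def mutate(chromosome, rng=random):
    """Flip one randomly chosen bit."""
    return flip_bit(chromosome, rng.randrange(len(chromosome)))


# Reproduces the first mutation shown above (the 12th bit is flipped).
print(flip_bit("0000101011000011", 11))  # 0000101011010011
```

Passing a seeded random.Random instance as rng makes a run repeatable, which helps when debugging.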

Example

As a quick example let’s manually evolve a simple function to see how the whole thing works.

Let’s say I have an array of 4 numbers (call it num) between 0 and 15. I want to know what values give me the best value for the following.

num[1]-num[2]-num[3]+num[4]

I know, that’s simple enough that we could work out the optimal solution just by eye. Not the point. This is enough to do a quick and effective demo with.

I’m going to encode that by simply converting the numbers in the array to binary and concatenating the binary representations of the 4 numbers (spaces just added for clarity). The fitness function is already defined. I’m going to start with an initial population of eight individuals.

1111 1111 1100 1110 – fitness = (15-15-12+14) = 2
0101 1010 1100 0011 – fitness = (5-10-12+3) = -14
1011 0111 0011 1111 – fitness = (11-7-3+15) = 16
1111 1001 1010 0011 – fitness = (15-9-10+3) = -1
1010 1010 1010 1010 – fitness = (10-10-10+10) = 0
1000 0010 0111 0110 – fitness = (8-2-7+6) = 5
0000 0001 1111 1110 – fitness = (0-1-15+14) = -2
1010 0101 0010 0101 – fitness = (10-5-2+5) = 8
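For readers who would rather not do the arithmetic by hand, here is a small Python sketch of the fitness function and of the selection step for this population. The names fitness and select_fittest are invented for the sketch, and the spaces have been removed from the bit strings.

```python
def fitness(chromosome):
    """num[1] - num[2] - num[3] + num[4] for a 16-bit string of four 4-bit numbers."""
    a, b, c, d = (int(chromosome[i:i + 4], 2) for i in range(0, 16, 4))
    return a - b - c + d


def select_fittest(population, count):
    """Return the `count` individuals with the highest fitness."""
    return sorted(population, key=fitness, reverse=True)[:count]


population = [
    "1111111111001110", "0101101011000011", "1011011100111111",
    "1111100110100011", "1010101010101010", "1000001001110110",
    "0000000111111110", "1010010100100101",
]

for individual in select_fittest(population, 4):
    print(individual, fitness(individual))
# 1011011100111111 16
# 1010010100100101 8
# 1000001001110110 5
# 1111111111001110 2
```

The function depends only on its input, so it is deterministic, and it is cheap to evaluate, which are the two requirements listed earlier.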

From this I’m going to take the 4 individuals with the highest fitness, use crossover operations (with the crossover point exactly in the middle) between them until I have 8 individuals for the 2nd generation and then apply a single bit mutation to one of the individuals (detailed steps left as an exercise for the reader)

1111 1111 0011 1111 – fitness = (15-15-3+15) = 12
1011 0111 1100 1110 – fitness = (11-7-12+14) = 6
1111 1011 0010 0101 – fitness = (15-11-2+5) = 7
1010 0101 1100 1110 – fitness = (10-5-12+14) = 7
1011 0111 0111 0110 – fitness = (11-7-7+6) = 3
1000 0010 0011 1111 – fitness = (8-2-3+15) = 18
1010 0101 0111 0110 – fitness = (10-5-7+6) = 4
1000 0010 0010 0101 – fitness = (8-2-2+5) = 9

We can already see an improvement. The average fitness is much higher than for the first generation, and the maximum has risen from 16 to 18. I’ll do one more generation in this example, again taking the 4 fittest individuals, crossing over to generate 8 new individuals and then applying a single bit mutation to two individuals. This time, however, the crossover point will be between the 4th and 5th bit.

1000 1111 0011 1111 – fitness = (8-15-3+15) = 5
1111 0010 0011 1111 – fitness = (15-2-3+15) = 25
1000 1011 0010 0101 – fitness = (8-11-2+5) = 0
1111 0010 0011 1111 – fitness = (15-2-3+15) = 25
1111 0010 1010 0101 – fitness = (15-2-10+5) = 8
1000 0111 0011 1111 – fitness = (8-7-3+15) = 13
1111 0010 0010 0101 – fitness = (15-2-2+5) = 16
1000 1011 0010 0101 – fitness = (8-11-2+5) = 0

I think that’s enough for this example. We’re getting fairly close to the best possible solution (15,0,0,15), which has a fitness of 30, close enough to see how this works. The population size was very low, which is why duplicates are appearing in the results of the crossover. With a larger population there would be a lot more diversity.
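The manual steps above can be automated. The following Python sketch is one possible way to do it, not a reference implementation. It makes several assumptions that differ from the hand-worked example: the crossover point is chosen at random, each child is mutated with a fixed probability (mutation_rate, an arbitrary value), and the selected parents are carried over unchanged into the next generation (often called elitism) so that the best fitness never drops.

```python
import random

BITS_PER_NUMBER = 4
CHROMOSOME_LENGTH = 16


def fitness(chromosome):
    """num[1] - num[2] - num[3] + num[4] for four 4-bit numbers."""
    a, b, c, d = (int(chromosome[i:i + BITS_PER_NUMBER], 2)
                  for i in range(0, CHROMOSOME_LENGTH, BITS_PER_NUMBER))
    return a - b - c + d


def crossover(parent_a, parent_b, point):
    """Single-point crossover: swap the tails after `point` bits."""
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(chromosome, rng):
    """Flip one randomly chosen bit."""
    i = rng.randrange(len(chromosome))
    flipped = "1" if chromosome[i] == "0" else "0"
    return chromosome[:i] + flipped + chromosome[i + 1:]


def evolve(population, generations, rng, mutation_rate=0.1):
    """Run the select / crossover / mutate loop and return the fittest individual."""
    size = len(population)
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        parents = sorted(population, key=fitness, reverse=True)[:size // 2]
        # Assumption of this sketch: the parents survive unchanged (elitism).
        next_generation = list(parents)
        while len(next_generation) < size:
            a, b = rng.sample(parents, 2)
            point = rng.randrange(1, CHROMOSOME_LENGTH)
            for child in crossover(a, b, point):
                if rng.random() < mutation_rate:
                    child = mutate(child, rng)
                next_generation.append(child)
        population = next_generation[:size]
    return max(population, key=fitness)


rng = random.Random(0)  # seeded so that a run can be repeated
start = ["".join(rng.choice("01") for _ in range(CHROMOSOME_LENGTH))
         for _ in range(8)]
best = evolve(start, generations=30, rng=rng)
print(best, fitness(best))
```

With only 8 individuals the run may stall below the optimum of 30. Increasing the population size or the number of generations is the usual first thing to try.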

I hope that anyone still reading found this brief diversion into the realms of AI interesting.

 

Friday, 30 July 2010 10:27:32 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
General | Artificial Intelligence
 Thursday, 29 July 2010

Here is a simple way to get a specific character count (for example, the number of occurrences of the character ‘0’) and the word count of a varchar string using T-SQL.

-- Count the occurrences of a specific character, in this case '0'
DECLARE @value VARCHAR(100)
SET @value = 'SQL Server 2000, SQL Server 2005, SQL Server 2008'
SELECT LEN(@value) - LEN(REPLACE(@value, '0', ''))

-- Count the number of words
DECLARE @String VARCHAR(100)
SELECT @String = 'SQL Server 2005 Stan test code'
SELECT LEN(@String) - LEN(REPLACE(@String, ' ', '')) + 1

 

As you can see, this is quite straightforward: the first script subtracts the length of the string with the searched character removed from the full length of the string, which gives the character count. The word-count script counts the space characters in the same way and adds 1, so it assumes that words are separated by single spaces.

 

Other posts:

How to remove multiple whitespaces from a string with SQL Server 2005

How to generate random numbers with a SQL query

How to remove leading zeros from the results of an SQL Query

How to Find The List Of Unused Tables Since The Last SQL Server Restart

Which performs better: ISNULL or COALESCE

Thursday, 29 July 2010 08:58:32 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Code Snippet | SQL

About the author/Disclaimer

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2017
Stanislas Biron