Developing for the translation industry

 Tuesday, February 23, 2010

The big news over the past few months has been the Aurora attacks and how they seemed to originate from China. Last month, Microsoft took the unusual step of releasing an out-of-band patch for the IE6 0-day vulnerability used in the attacks.

It was always thought the exploit originated from China, since parts of the code had only been discovered on Chinese-language sites. The latest news is that US investigators have now traced the actual origin of the code.

US investigators have pinpointed the author of a key piece of code used in the alleged cyber attacks on Google and at least 33 other companies last year, according to a new report.

Citing a researcher working for the US government, The Financial Times reports that a Chinese freelance security consultant in his 30s wrote the code that exploited a hole in Microsoft’s Internet Explorer browser. The report also says that Chinese authorities had “special access” to this consultant’s work and that he posted at least a portion of the code to a hacking forum.

According to The Financial Times report, the unnamed security consultant who wrote the exploit code is not a full-time government worker and did not launch the attacks himself. In fact, the FT says, he “would prefer not to be used in such offensive efforts.”

The report says that when he posted the code to the hacking forum, he described it as something he was “working on.”

In a January blog post, Google announced that attacks originating from China had pilfered unspecified intellectual property from the company, and Microsoft later said the attack had exploited a hole in its Internet Explorer 6 browser. According to security researchers, at least 33 other companies were targeted by similar attacks.

Put simply, this means that the “consultant” who created the code posted a proof of concept for the exploit on a hacking forum. Someone then took that proof of concept, turned it into a working exploit, and attacked 33 US-based companies.

It will be interesting to watch how this story unfolds and whether it increases tension between the US and Chinese governments. The cyberwar has been going on for quite a while now, with both sides trying to secretly steal information from each other.

So far the author of the code has not been named, and his real identity and purpose remain somewhat vague.

Source: The Register


Other posts:

Google Will Pay 500$ Bounty For Each Chrome Browser Bugs You Find

Google Translator Hacked

Password aren't a good defense?

In the news: Google negotiating cooperation with the NSA

Some tips to enhance your SQL Server security

How To: Create an Outlook 2003 addin using VSTO SE and Visual Studio 2005

Tuesday, February 23, 2010 10:33:37 AM (Eastern Standard Time, UTC-05:00)
News | Security
 Monday, February 22, 2010

Scott Hanselman recently published the results of his survey: What .NET Framework features do you use?

Quite interesting numbers! It’s interesting to see that such a high percentage of respondents still use WinForms. It’s also interesting to note that the number of Silverlight users is higher than the number of WPF users.

The survey also shows that WebForms, Ajax, WCF and Linq2SQL are clearly the technologies of choice as of now.


Other posts:

How-To: hiring and managing geeks

Sorting strings for real people - A human-friendly IComparer

How to set NTFS permissions using C# 2005

How to Use Active Directory to authenticate users in C#

How to find monday of the current week using T-SQL

Monday, February 22, 2010 11:13:35 AM (Eastern Standard Time, UTC-05:00)
 Wednesday, February 10, 2010

I like to use images to help illustrate the theme or point of a blog post. It’s a proven “best practice” in blogging, and I highly recommend that every blogger do it.

One trick for easily finding and properly using images in your blog posts is to search the creative commons licensed photos on the photo sharing site Flickr.

So, what’s Creative Commons?

Creative Commons is a non-profit organization that has created a standardized set of tools for granting various levels of permission for people to use creative works freely. The author, or in this case the photographer, of the work designates a type of license, and Flickr then lets you sort through and find only photos that are free to use in blog posts. I choose photos that carry the attribution/share-alike license. This means that I may use the image here as long as I attribute it to the Flickr account where I found it. Here’s Flickr’s description of CC licenses.

So, here’s how to find and grab great images.

  1. Surf to the Flickr Creative Commons Search Page – all images you search for here are free to use with proper attribution
  2. Search for a specific phrase or concept and choose the image that fits
  3. Click on “all sizes” and choose the size you wish to post on your blog
  4. Right click the image and choose “copy image location” – use this path to paste into your blog post where you want the image to appear
  5. Somewhere in your post add the words – Image credit and the link to the Flickr account where you found the image (see at the bottom of the post)

To be a good photo user, make sure you add your own images and make them available through the proper CC license – you can make this a default Flickr account setting.

Image credit: dashitnow

Other Posts:

The Best Damn Web Marketing Checklist, Period!

What Are Customers Saying About You Online?

8 easy tips to drive traffic from search engines to your site

Wednesday, February 10, 2010 9:35:46 AM (Eastern Standard Time, UTC-05:00)
 Tuesday, February 09, 2010

This is a very funny video that I found a while ago… Thought I should share it with you all!


Other posts:

List of Crazy Laws in the United States

When CAPTCHA goes bad

Chuck Norris Programming facts and More Programming Chuck Norris facts

Remember Windows ME?

SQL Injection humor 

Tuesday, February 09, 2010 3:22:16 PM (Eastern Standard Time, UTC-05:00)

Here is a quick and easy way to remove multiple whitespaces from a string, leaving only one space character between tokens.

CREATE FUNCTION dbo.RemoveExtraSpaces (@string VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN
    -- Trim leading/trailing spaces, then collapse runs of spaces
    -- (the CREATE FUNCTION wrapper and name are reconstructed from the mangled listing)
    SET @string = LTRIM(RTRIM(@string))
    WHILE CHARINDEX('  ', @string) > 0
        SET @string = REPLACE(@string, '  ', ' ')
    RETURN @string
END
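The same collapse-until-stable technique, sketched in Python for comparison (the helper name is mine, not part of the original snippet):

```python
def collapse_spaces(s: str) -> str:
    # Trim the ends, then collapse double spaces until none remain,
    # leaving a single space between tokens
    s = s.strip()
    while "  " in s:
        s = s.replace("  ", " ")
    return s

print(collapse_spaces("  too   many    spaces  "))  # -> "too many spaces"
```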


Other Posts:

How to generate random numbers with a T-SQL query

How to insert a file in an image column in SQL Server 2005

How to track the growth of your SQL Server database

SQL Server indexing best practices and guidelines

How to remove leading zeros from the results of an SQL Query

Tuesday, February 09, 2010 1:24:17 PM (Eastern Standard Time, UTC-05:00)
Code Snippet | SQL

Google is developing software for the first phone capable of translating foreign languages almost instantly — like the Babel Fish in The Hitchhiker’s Guide to the Galaxy.

By building on existing technologies in voice recognition and automatic translation, Google hopes to have a basic system ready within a couple of years. If it works, it could eventually transform communication among speakers of the world’s 6,000-plus languages.

The company has already created an automatic system for translating text on computers, which is being honed by scanning millions of multi-lingual websites and documents. So far it covers 52 languages, adding Haitian Creole last week.

Google also has a voice recognition system that enables phone users to conduct web searches by speaking commands into their phones rather than typing them in.

“We think speech-to-speech translation should be possible and work reasonably well in a few years’ time,” said Franz Och, Google’s head of translation services.

“Clearly, for it to work smoothly, you need a combination of high-accuracy machine translation and high-accuracy voice recognition, and that’s what we’re working on.

“If you look at the progress in machine translation and corresponding advances in voice recognition, there has been huge progress recently.”

Although automatic text translators are now reasonably effective, voice recognition has proved more challenging.

“Everyone has a different voice, accent and pitch,” said Och. “But recognition should be effective with mobile phones because by nature they are personal to you. The phone should get a feel for your voice from past voice search queries, for example.”

The translation software is likely to become more accurate the more it is used. And while some translation systems use crude rules based on the grammar of languages, Google is exploiting its vast database of websites and translated documents to improve the accuracy of its system.

“The more data we input, the better the quality,” said Och. There is no shortage of help. “There are a lot of language enthusiasts out there,” he said.

However, some experts believe the hurdles to live translation remain high. David Crystal, honorary professor of linguistics at Bangor University, said: “The problem with speech recognition is the variability in accents. No system at the moment can handle that properly.

“Maybe Google will be able to get there faster than everyone else, but I think it’s unlikely we’ll have a speech device in the next few years that could handle high-speed Glaswegian slang.

“The future, though, looks very interesting. If you have a Babel Fish, the need to learn foreign languages is removed.”

In the Hitchhiker’s Guide to the Galaxy, the small, yellow Babel Fish was capable of translating any language when placed in the ear. It sparked a bloody war because everyone became able to understand what other people were saying.

Source: Times Online


Other Posts:

Google Willing To Pay 500$ Bounty For Each Chrome Browser Bugs You Find

Silverlight Game Creation Tutorials

Facts and Figures about the Language Industry

Google Translator Hacked

Compendium of Dumb Laws in the United States

Tuesday, February 09, 2010 9:39:00 AM (Eastern Standard Time, UTC-05:00)
Language Industry | News
 Friday, February 05, 2010

You can place a robots.txt file in the root of your site to help inform search engines and other bots about the areas of your site that you don’t want them to access. For example, you may not want bots to access the content of your images folder: 

User-agent: *
Disallow: /images/

You can also provide instructions for particular bots. For example, to exclude Google image search from your entire site, use this: 

User-agent: Googlebot-Image
Disallow: /

The robots.txt standard is unfortunately very limited; it only supports the User-agent and Disallow fields, and the only wildcard allowed is when you specify it by itself in User-agent, as in the previous example.

Google has introduced support for a couple of extensions to the robots.txt standard. First, you can use limited patterns in pathnames. You can also specify an Allow clause. Since those extensions are specific to Google, you should probably only use them with one of the Google user agents or with Googlebot, which all of its bots recognize.

For example, you can block PNG files from all Google user agents as follows: 

User-agent: Googlebot
Disallow: /*.png$

As with regular expressions, the asterisk means to match any sequence of characters, and the dollar sign means to match the end of the string. Those are the only two pattern matching characters that Google supports.

To disable all bots except for Google, use this: 

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

To exclude pages with sort as the first element of a query string that can be followed by any other text, use this:

User-agent: Googlebot
Disallow: /*?sort

This clause, too, will only work with the Google bots.
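For the basic (non-extended) rules shown earlier, Python's standard-library robots.txt parser offers a quick way to sanity-check a rule set. Note that urllib.robotparser implements only the core standard, not Google's wildcard extensions, and the hostname here is just a placeholder:

```python
from urllib import robotparser

# The example rules from the post: block Google image search everywhere,
# and keep every other bot out of /images/
rules = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow: /images/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("SomeBot", "http://example.com/images/logo.png"))     # False
print(rp.can_fetch("SomeBot", "http://example.com/index.html"))          # True
print(rp.can_fetch("Googlebot-Image", "http://example.com/index.html"))  # False
```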


Other posts:

White House new Robots.txt

8 easy tips to drive traffic from search engines to your site

Huge List of Dumb and Crazy Laws in the United States

Tools for Web developers

Friday, February 05, 2010 2:26:11 PM (Eastern Standard Time, UTC-05:00)

In January, Google went public with news that some of its systems had been hacked, along with those of a number of US-based companies. The attacks had targeted both accounts maintained by political activists and commercial code, and Google pointed the finger straight at China, vowing to change its entire approach to business in that country. But a report now suggests that the company is also looking to beef up its internal defenses to prevent a repeat of the attacks.

The Washington Post is reporting that Google has started negotiations with the US National Security Agency about a collaborative effort to analyze the attack and figure out how best to prevent a recurrence. The Post is citing confidential sources, as the deal isn't final and, even if it were, it's unlikely that Google would seek to publicize it.

For starters, both organizations have already been the target of many complaints by privacy advocates, the NSA for its domestic surveillance efforts, Google for its data retention policies. The combination of the two would clearly make the advocates far more uneasy, and might help them make their case with the wider public. Meanwhile, as the report notes, private companies have often been loath to share information about their proprietary systems with the government for a variety of reasons.

That may explain why the negotiations have been going slowly, as the NSA would clearly need access to and understanding of Google's infrastructure in order to fully evaluate the attacks and future risks. And that's precisely the sort of proprietary information that Google is presumably reluctant to provide anyone with—even a highly secretive organization like the NSA.

Other posts:

Google Willing To Pay 500$ Bounty For Each Chrome Browser Bugs You Find

Google Translator Hacked

BusinessWeek hit by SQL Injection attack

Password aren't a good defense?

Friday, February 05, 2010 10:30:04 AM (Eastern Standard Time, UTC-05:00)
News | Security
 Wednesday, February 03, 2010

The sys.objects system table offers you some information on the last modifications made on any database object. As a quick example, the following script queries for the list of user tables modified since the start of this year.

DECLARE @date DATETIME
SET @date = '2010-01-01'

SELECT   name, modify_date
FROM     sys.objects
WHERE    type = 'U'
         AND modify_date >= @date
ORDER BY modify_date

Now this knowledge can help you manage your SQL Servers more easily. You could, for example, create a script that runs every night and sends you an email listing the objects modified during the previous day.
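A nightly report like the one described above could be sketched as follows. This is only an illustration with hypothetical row data; the actual database query and SMTP delivery are left out:

```python
from email.message import EmailMessage

def build_report(rows):
    """rows: (object_name, modify_date) pairs, as returned by the sys.objects query."""
    body = "\n".join(f"{name}\t{modified}" for name, modified in rows) or "No changes."
    msg = EmailMessage()
    msg["Subject"] = "Database objects modified yesterday"
    msg.set_content(body)
    return msg

# Hypothetical query result; a real script would fetch this from SQL Server
report = build_report([("Customers", "2010-02-02 09:12:00")])
print(report.get_content())
```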

The list of available types you can filter on is:

AF Aggregate function (CLR)
C CHECK constraint
D DEFAULT (constraint or stand-alone)
F FOREIGN KEY constraint
FN SQL scalar function
FS Assembly (CLR) scalar-function
FT Assembly (CLR) table-valued function
IF SQL inline table-valued function
IT Internal table
P SQL Stored Procedure
PC Assembly (CLR) stored-procedure
PG Plan guide
PK PRIMARY KEY constraint
R Rule (old-style, stand-alone)
RF Replication-filter-procedure
S System base table
SN Synonym
SQ Service queue
TA Assembly (CLR) DML trigger
TF SQL table-valued-function
TR SQL DML trigger
TT Table type
U Table (user-defined)
UQ UNIQUE constraint
V View
X Extended stored procedure


Other posts:

How to: Find The List Of Unused Tables Since The Last SQL Server Restart

Differences between temporary tables and table variables

Using derived tables to boost SQL performance

How to track the growth of your database

Wednesday, February 03, 2010 11:00:54 AM (Eastern Standard Time, UTC-05:00)
Code Snippet | SQL
 Monday, February 01, 2010

This script will return a list of user tables in the database, along with the last time a SELECT was executed against them (including any views that include data from that table). This can be used to determine if a table is not in use anymore.

Statistics are reset when the SQL Server service is restarted, so this query will only return activity since that time. Also, just because there's no activity doesn't mean it's safe to remove an object: some tables may only be used during monthly or annual processes, and so they won't show any activity except during those brief intervals.

WITH lastactivity(objectid, lastaction)
     AS (SELECT object_id      AS tablename,
                last_user_seek AS lastaction
         FROM   sys.dm_db_index_usage_stats u
         WHERE  database_id = DB_ID()
         UNION ALL
         SELECT object_id,
                last_user_scan
         FROM   sys.dm_db_index_usage_stats u
         WHERE  database_id = DB_ID()
         UNION ALL
         SELECT object_id,
                last_user_lookup
         FROM   sys.dm_db_index_usage_stats u
         WHERE  database_id = DB_ID())
SELECT   OBJECT_NAME(so.object_id) AS tablename,
         MAX(la.lastaction)        AS lastselect
FROM     sys.objects so
         LEFT JOIN lastactivity la
           ON so.object_id = la.objectid
WHERE    so.type = 'U'
         AND so.object_id > 100
GROUP BY OBJECT_NAME(so.object_id)
ORDER BY OBJECT_NAME(so.object_id)
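The heart of the query, taking the most recent of three nullable timestamps for each table, can be sketched in Python (an illustrative helper, not part of the original script):

```python
def last_select(seek, scan, lookup):
    """Most recent of the three nullable timestamps, or None if the table
    shows no recorded activity at all (mirrors MAX over the UNIONed rows)."""
    times = [t for t in (seek, scan, lookup) if t is not None]
    return max(times) if times else None

# ISO-formatted date strings compare correctly as text
print(last_select("2010-01-05", None, "2010-01-20"))  # "2010-01-20"
print(last_select(None, None, None))                  # None
```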


Other posts:

How to generate random numbers within a T-SQL query

SQL Server SPACE function

SQL Injection humor

Differences between temporary tables and table variables

How to track the growth of your database

How to get the total number of rows in a database

Using derived tables to boost SQL performance

Monday, February 01, 2010 10:36:17 AM (Eastern Standard Time, UTC-05:00)
Code Snippet | SQL

This is a pretty interesting development from Google, and something that seems to be becoming much more common: companies openly offering payments for bugs/vulnerabilities discovered in their software. Google already used that strategy to find bugs last year with their Native Client Security hacking contest. This time they are offering $500 for most vulnerabilities and $1,337 for 'particularly clever' flaws. You can see the blog post on the Chromium blog here.

It’s a chance for the white-hat guys to earn a few bucks, but honestly I don’t think it’s going to change anything, especially not at $500 per vulnerability, because a serious browser 0-day exploit that allows execution of malware will go for 100 times that much on the black market. Even the particularly severe or clever bugs worth $1,337 are still peanuts compared to what an exploit can fetch on the black market.

I hope it helps, though, and gives legitimate security researchers a little more incentive to focus on Chrome. The bad guys won’t pay much attention, as Chrome is still a relatively small player in the browser world.

From the article at Network World

“We are hoping that … this program will encourage new individuals to participate in Chromium security,” said Evans. “The more people involved in scrutinizing Chromium’s code and behavior, the more secure our millions of users will be.”

“Internet Explorer, Safari, Firefox…those browsers have been out there for a long time,” said Pedram Amini, manager of the security research team at 3com’s Austin, Tex.-based TippingPoint, which operates Zero Day Initiative (ZDI), one of the two best-known bug-bounty programs. “But Chrome, and now Chrome OS, need researchers. Google needs people to put eyes on the target.”

Google’s new bounty program isn’t the first from a software vendor looking for help rooting out vulnerabilities in its own code, but it’s the largest company to step forward, Amini said. Microsoft, for example, has traditionally dismissed any calls that it pay for vulnerabilities. “This will be beneficial to Google,” Amini added. “There are actually very few vendors who play in the bounty market, but Google doing it is definitely interesting.”

I don’t realistically expect any groundbreaking bugs to come out of this initiative, but I think a few people might bust out their browser fuzzing tools and see what they can find.

It would be worth a bit of effort, though, if you could find 10 decent bugs in a couple of hours and net yourself $5,000.

You can see the different severity rankings for bugs on the Chromium project's severity guidelines page.


Other posts:

Google Translator Hacked

Some tips to enhance your SQL Server security

What is LDAP injection?

How to generate random numbers within a T-SQL query

Monday, February 01, 2010 10:22:17 AM (Eastern Standard Time, UTC-05:00)
News | Security
 Friday, January 29, 2010

With all of the news about hacked e-mail accounts, it isn’t a big surprise that other Google services can be manipulated, too. Yesterday, politicking or pranking Russian translators forced Google Translate to mistranslate four segments from Russian into English: “USA is to blame,” “Russia is to blame,” “Obama is to blame,” and “Medvedev is to blame” (click here to see a screenshot).

Google Переводчик (Translate) rendered the U.S. and President Obama blameless in the Russian translation (“USA is not to blame” and “Obama is not to blame”), while placing blame on Russia and President Medvedev. Naturally, soon after the news went up, Google quickly fixed the translations.

According to Moscow News:

The same is true if the word combination is translated into Ukrainian and Belorussian. However, if the output translation is set to Spanish, French, German, and other European languages, it is translated correctly. […]

"These are translation bombs," said Alla Zabrovskaya, Google's Russian Public Relations Director. "We are not always able to weed them out, and it is good that our users find them and let us know about them."

But the question that remains is: how many more of these mistranslations (or “translation bombs,” as they call them) exist in the Google translation engine (or any other automatic translation engine)? Some of them should be very easy to spot (such as translating “White House” with “ Visit”) but others will be spotted only through careful analysis of the translation.

The lessons learned here:

  • Crowdsourcing applications need protection against malicious manipulation, because the wisdom of the crowd will increasingly reflect the politics of its members.
  • Online translation applications are only as reliable as the crowd that feeds them. You should therefore never use them to translate important documents or messages. Machine translation is only useful for grasping the general meaning of a piece of content, nothing more.
  • If you need real professional translations, work with a real translation provider. Unlike automatic translation engines, they can guarantee that your message will be the same in every language.


Other posts:

In the news: Bing translator now supports Haitian Creole

Facts and Figures about the Language Industry

Some tips to enhance your SQL Server security

Big news in the translation industry

Friday, January 29, 2010 10:47:40 AM (Eastern Standard Time, UTC-05:00)
Language Industry | News

About the author/Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2019
Stanislas Biron