Developing for the translation industry



 Friday, 22 August 2008

Found this on Engadget:

Generally, when someone makes a teddy bear-themed gadget, his/her intention is to overwhelm bystanders with cuteness. But whoever created this little guy, whose head has to be removed in order to access the internal USB drive, must have watched one too many Tim Burton movies. No word on how much it holds or if there are any plans to make these available for purchase, but with your own bear, a thumb drive, some thread and a closet full of skeletons, you can probably make your own without too much effort.


 

More humorous posts here:

When CAPTCHA goes bad

Programming is like sex

 

Friday, 22 August 2008 14:14:20 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Humor

Here are some guidelines that I gathered on indexing and boosting SQL Server query performance. I hope that these tips will be as useful to you as they were to me!

  1. Periodically run the Index Wizard or the Database Engine Tuning Advisor against current Profiler traces to identify potentially missing indexes.
  2. Remove indexes that are never used.  This will improve INSERT, UPDATE, and DELETE performance because the database engine will have fewer indexes to maintain when those operations occur.
  3. Normally, every table should have a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases — such as an identity column. In many cases, the primary key is the ideal column for a clustered index.
  4. Indexes should be considered on all columns that are frequently accessed by the JOIN, WHERE, ORDER BY, GROUP BY, TOP, and DISTINCT clauses.
  5. When creating indexes, try to make them unique indexes if at all possible. SQL Server can often search through a unique index faster than a non-unique index because in a unique index, each row is unique, and once the needed record is found, SQL Server doesn’t have to look any further.
  6. If a column in a table is not at least 95% unique, then the query optimizer will most likely not use a non-clustered index based on that column. Because of this, you generally don’t want to add non-clustered indexes to columns that aren’t at least 95% unique.
  7. This seems obvious, but some people forget to follow this simple advice: don't automatically add indexes on a table because it seems like the right thing to do. Only add indexes if you know that they will be used by the queries run against the table.
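
To get a feel for the "95% unique" figure in tip 6, here is a small Python sketch that computes the percentage of distinct values in a column. The function name and the sample data are made up for this illustration; against a real table, the same idea would typically compare COUNT(DISTINCT column) with COUNT(*).

```python
def uniqueness_percent(values):
    """Return the percentage of distinct values in a column (0 for an empty column)."""
    if not values:
        return 0.0
    return 100.0 * len(set(values)) / len(values)


# Hypothetical sample columns, for illustration only.
order_ids = [101, 102, 103, 104, 105]        # every value is different
countries = ["CA", "CA", "US", "CA", "US"]   # only two distinct values

print(uniqueness_percent(order_ids))   # 100.0
print(uniqueness_percent(countries))   # 40.0
```

By the rule of thumb above, a column like the hypothetical countries one would be a poor candidate for a non-clustered index.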

If you are like me and want to know more about how SQL Server manages its indexes, take a look at the sysindexes table that is part of every database.  Simply run “SELECT * FROM sysindexes”.

Here are some of the more interesting fields found in this table:

  • dpages: If the indid value is 0 or 1, then dpages is the count of the data pages used for the index. If the indid is 255, then dpages equals zero. In all other cases, dpages is the count of the non-clustered index pages used in the index.
  • id: Refers to the id of the table this index belongs to.
  • indid: This column indicates the type of index. For example, 1 is for a clustered table, a value greater than 1 is for a non-clustered index, and a 255 indicates that the table has text or image data.
  • OrigFillFactor: This is the original fillfactor used when the index was first created, but it is not maintained over time.
  • statversion: Tracks the number of times that statistics have been updated.
  • status: 2 = unique index, 16 = clustered index, 64 = index allows duplicate rows, 2048 = the index is used to enforce the Primary Key constraint, 4096 = the index is used to enforce the Unique constraint. These values are additive, and the value you see in this column may be a sum of two or more of these options.  For example, a value of 2066 (2048 + 16 + 2) means that the index is clustered, unique, and used to enforce the Primary Key constraint.
  • used: If the indid value is 0 or 1, then used is the number of total pages used for all index and table data. If indid is 255, used is the number of pages for text or image data. In all other cases, used is the number of pages in the index.
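
Since the status values are additive, each one behaves like a bit flag. Here is a minimal Python sketch of how a status value can be decoded, using only the flag values listed above (this is not an exhaustive list, and the names STATUS_FLAGS and decode_status are just mine for this example):

```python
# Flag values as listed in the post above (not an exhaustive list).
STATUS_FLAGS = {
    2: "unique index",
    16: "clustered index",
    64: "index allows duplicate rows",
    2048: "enforces the Primary Key constraint",
    4096: "enforces the Unique constraint",
}


def decode_status(status):
    """Return the descriptions of every known flag set in a status value."""
    return [label for bit, label in STATUS_FLAGS.items() if status & bit]


print(decode_status(2066))
# ['unique index', 'clustered index', 'enforces the Primary Key constraint']
```

The same test can be done directly in a query with the bitwise AND operator (&) on the status column.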

Other popular SQL Posts :

How to insert a file in an image column in SQL Server 2005

How to get the total number of rows in a database

How to remove leading zeros within an SQL Query

Friday, 22 August 2008 11:40:31 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
SQL

Usability testing is a technique used to evaluate a product by testing it on users. This is, in my opinion, the best way to get good feedback on a website or product.

UserTesting.com is a Web startup where you can sign up and submit a site for a usability test. Real users then log in to, enroll in, or use your site or service while recording everything, and you receive a Flash video of each session with the user's commentary.

Here’s how it works:

  • You sign up for user testing, specifying the demographic profile of your target audience and how many user testers you want (one user costs $19, five users cost $95).
  • Users record their screen and voice as they use your website, speaking their thoughts as they browse.
  • You watch and listen to them use your site. Each user’s session - mouse movements, clicks, keystrokes, and spoken comments - is saved as a Flash video for you to watch.
  • You read their review.
    • What they liked.
    • What they didn’t like.
    • What would have caused them to leave your site.

That means that, for a ridiculously small amount of money (less than $100), you can get tremendous feedback on your site, feedback that you might never get otherwise.

Great idea, guys. Keep up the good work!

If you liked this post, you might also like : What are your customers saying about you online?

Friday, 22 August 2008 11:00:32 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Business | Marketing
 Thursday, 21 August 2008

What is a portable program? A portable program is a piece of software that you can carry around with you on a portable device and use on any other computer. It can be your email program, your browser, system recovery tools or even an operating system. The coolest part is that all of your data and settings are always stored on the thumb drive, so when you unplug the device, none of your personal data is left behind.

This is the first of x posts on different portable software/tools.

  • Nvu : Easy-to-use webpage editor. Simple alternative to Dreamweaver and Microsoft FrontPage
  • Server2Go : Apache webserver
  • InstantRails : Contains Ruby, Rails, Apache, and MySQL, all preconfigured and ready to run.
  • Putty : Telnet and SSH client
  • Follow-Me IP : Displays your external IP address
  • XAMPP : Integrated server package of Apache, MySQL, PHP and Perl. Just unzip and run.
  • HTTP File Server : Simple and easy-to-use file server for personal file sharing.
  • CurrPorts : Lets you view a list of ports that are currently in use, along with applications that use them
  • Quick’n Easy FTP Server : Portable FTP server.
Thursday, 21 August 2008 16:17:32 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Tools | Portable Software
 Monday, 11 August 2008


Randal Stross of the New York Times explains why many experts propose dropping website passwords entirely in favor of a security system based on cryptography.

The best password is a long, nonsensical string of letters and numbers and punctuation marks, a combination never put together before. Some admirable people actually do memorize random strings of characters for their passwords — and replace them with other random strings every couple of months.

Then there’s the rest of us, selecting the short, the familiar and the easiest to remember. And holding onto it forever.

I once felt ashamed about failing to follow best practices for password selection — but no more. Computer security experts say that choosing hard-to-guess passwords ultimately brings little security protection. Passwords won’t keep us safe from identity theft, no matter how clever we are in choosing them.

Read Full story here.

Monday, 11 August 2008 13:19:25 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
General
  • The system works because Chuck Norris tells it to work
  • Chuck Norris doesn't need a test suite. The test suite needs Chuck Norris.
  • CPUs run faster to get away from Chuck Norris
  • Chuck Norris normalizes all schema just by inserting random data
  • Packets travel faster than the speed of light for Chuck Norris, but he can still catch them
  • Chuck Norris's brain is his revision control, and it works better than git
  • Chuck Norris can finish an infinite loop in 1.3 seconds.
  • Code written by Chuck Norris cannot be optimized.
  • Chuck Norris never dies.  He simply returns 0.
  • Chuck Norris can break Moore's Law
  • Chuck Norris doesn't need compilers nor editors. He roundhouse kicks the disk and the bytecode appears.
  • Chuck Norris doesn't use GOTO. Code comes to him.
  • There is no theory of probability, just a list of events that Chuck Norris allows to occur.  
  • 90% of the world's spam is hand-typed by Chuck Norris. It takes him only 3 minutes.
  • Chuck Norris can parse invalid XML
  • Every time you don't use "use strict" Chuck Norris kills a kitty.
  • The best compression algorithm in existence is Chuck Norris's fists.
  • Chuck Norris can divide by 0.
  • Chuck Norris can compile syntax errors
  • The one true bracing style is the one Chuck Norris uses.
  • Every program Chuck Norris has written can be run backwards. It will rollback whatever it did.
  • No matter how you encrypt your traffic, Chuck Norris can read it by just looking at the cable. His ears can intercept wifi transmissions.
  • Chuck Norris can enrich himself simply by hacking your bank account. He does not do this because there is no challenge in it.
Monday, 11 August 2008 13:16:11 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Humor
 Tuesday, 05 August 2008


The word on the street is that Google is about to launch a new translation service.  Called “Google Translation Center”, this service will:

  • Connect translators with clients
  • Let translators work for free or charge their clients for their work.
  • Let translators translate their documents online
  • Provide translators with a CAT (computer-assisted translation) tool similar to the other tools available on the market

From the article at TechCrunch:

If you have a document that needs translating, you can upload it and request a translator to work on it, according to the marketing information on the site. The Translation Center is set up as a marketplace for matching translators with people who need texts translated. It supports both paid translations and volunteer ones.

Also, Google doesn’t want to take part, for now, in the payment process.  They state in their terms of service:

Your interaction with any third party participant(s) or user(s) within Google Translation Center, including payment and delivery of goods and services, and any other terms, conditions, warranties or representations associated with such dealings, are solely between you and such third party participant(s) or user(s) and Google is not involved in such dealings.

Translations created in Google Translation Center are purely between the translation requester and the translators.

As an R&D Director for a translation firm in Canada, I noticed this news right away.  Here is my breakdown of the impact this new service will have, along with my humble predictions:

So, what does all of this mean for the translation industry?

For translator networks:

This will surely steal business from a lot of web sites that connect translators to clients, such as Elance and Craigslist, but not enough to put them out of business, since they have more than translation projects in their portfolio.

For professional freelance translators:

For a lot of them, this will probably become their primary portal, since Google is very good at indexing data sources other than its own (just check the sources of the videos featured on Google Video and you will see what I mean).  Google will probably index every translator gig available in the world and provide translators with a portal to search for gigs, maybe bid on them, and carry out the translation.

For professional translation firms:

For translation firms, this is neither good nor bad news. They may lose a handful of customers who will get very cheap translations on Google's platform.  But this is one industry where the saying “You get what you pay for” is really true. You won’t have any quality assurance when using this kind of service and, for many customers, this matters a lot. The quality of corporate communications is a mirror of the company’s professionalism. When you are a major bank, or in the medical industry (where a typo in a prescription can effectively kill someone), you can’t afford a low-quality translation. And you will never be sure of the quality of a translation provided through Google’s service (or any other online service, for that matter), because the reviser might be your old Uncle Joe, who only runs Word’s spell-checker on your document.

For translator tools software vendors:

This will probably be the main spot in the industry where the impact of this service will be felt.  For these vendors, the whole freelancer market is at risk, since freelancers will have access to a CAT tool and translation memories for free. The only market left for them once the service becomes mainstream will be the big translation firms, for the reasons stated above.

For the future of Google’s platform:

The big challenge for Google with this platform is to keep the spammers away.  How easy will it be to log in as a “fake translator” and insert advertising into a document? Then, when the client gets his translation, he will be hit directly by the ad when reviewing the document.  Or worse, the ad won’t be caught (a very possible case, since you won’t know every language your document/brochure/Web site/etc. has been translated into) and will be published as part of that document. The worst-case scenario for Google is that email spammers will use the platform to publish their ads, since spam emails rarely even get opened by their targets, while spam inserted as part of a translation in a legitimate document will be a lot more effective.

 

UPDATE: Google removed most of the pages and reference documents (all URLs are now redirected to Google’s main page).

 

Tuesday, 05 August 2008 10:28:52 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Language Industry | News
 Thursday, 31 July 2008

AppScout has a good article on Amazon’s new payment services:

Amazon on Tuesday unveiled two new payment options that allow Web site owners to shift payment transactions to the online retailer.

With Checkout by Amazon, webmasters will get help from Amazon in managing shipping charges, sales tax, promotions, and post-sale activities including shipments, refunds, cancellations, and charge backs.

Amazon Simple Pay, meanwhile, is a less complicated option for those who don't need Amazon's end-to-end checkout pipeline and order management capabilities.

To enable Checkout by Amazon, webmasters must add Amazon 1-Click to their Amazon.com account and insert a few lines of code into their Web site template. When customers go to check out of the site, they click the "Checkout with Amazon" button, a widget pops up asking for a shipping address, and customers buy via Amazon's 1-Click. Amazon sends them a confirmation e-mail, and users remain on your site.

Click here to read more

Thursday, 31 July 2008 08:37:49 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
General | News
 Wednesday, 30 July 2008

Please, go see this video. This is, sadly, so true:

http://tinyurl.com/6s4s8g

Wednesday, 30 July 2008 16:04:50 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Humor

CAPTCHAs are everywhere now.  When you want to open an account, anywhere, you will encounter one of those.  But there are times when the programmers probably needed a spec describing exactly what the expression “proving that you are a human” means…

Corey Smith found some of the worst CAPTCHAs on the Web.  Here is my personal favourite:

Mathcaptcha

Wednesday, 30 July 2008 13:26:43 (Eastern Standard Time, UTC-05:00)  #    Comments [1] -
General | Humor

A common thing you may want to do when dealing with transactions involving various documents and files is insert them into your SQL Server database.  The following code snippet lets you load a file from the disk and insert it into your database.

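-- Read the whole file as a single binary value and insert it.
INSERT INTO myTable (documentData)
SELECT dt.BulkColumn
FROM OPENROWSET(BULK N'c:\myDocument.doc', SINGLE_BLOB) AS dt  -- the alias "dt" is required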

Note that you need to give the OPENROWSET result a correlation name, that is, an alias (here, I named it “dt”), or you will get this error message:

Server: Msg 491, Level 16, State 1, Line 3
A correlation name must be specified for the bulk rowset in the from clause.


Wednesday, 30 July 2008 12:38:15 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Code Snippet | SQL
 Thursday, 10 July 2008

SQL Server 2005's new ROW_NUMBER() function makes this really easy.

All you have to do is add the ROW_NUMBER() function with an OVER() clause, like this:

SELECT ROW_NUMBER() 
        OVER (ORDER BY EmployeeName) AS Row, 
    EmployeeId, EmployeeName, Salary 
FROM Employees

The OVER clause needs an ORDER BY parameter to know how to sort the rows for proper numbering.
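
If it helps to picture what the numbering does, here is a rough Python sketch of the same idea: sort the rows, then number them starting at 1. The sample rows and the with_row_numbers helper are made up for this illustration and are not part of SQL Server.

```python
# Hypothetical rows standing in for the Employees table.
employees = [
    {"EmployeeId": 3, "EmployeeName": "Carol", "Salary": 52000},
    {"EmployeeId": 1, "EmployeeName": "Alice", "Salary": 61000},
    {"EmployeeId": 2, "EmployeeName": "Bob", "Salary": 48000},
]


def with_row_numbers(rows, order_key):
    """Sort rows by order_key and attach a 1-based Row number to each."""
    ordered = sorted(rows, key=lambda row: row[order_key])
    return [{"Row": number, **row} for number, row in enumerate(ordered, start=1)]


for row in with_row_numbers(employees, "EmployeeName"):
    print(row["Row"], row["EmployeeName"])
# 1 Alice
# 2 Bob
# 3 Carol
```

Without a sort order, the numbers would depend on whatever order the rows happen to come back in, which is why the ordering has to be stated.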
 
Thursday, 10 July 2008 15:58:08 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Code Snippet | SQL

There is a very simple script to accomplish this, and it can be really helpful for generating stats on a week-by-week basis:

SELECT DATEADD(wk, DATEDIFF(wk, 0, GetDate()), 0)

Replace GetDate() with a datetime column and you can generate, for example, a week-by-week sales report.
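
To show the idea outside of SQL, here is a small Python sketch that snaps a date back to the start of its week and then groups some sales by week. It assumes that weeks start on Monday, and the sample sales figures are invented. It is only an analogue of the query above: the T-SQL expression counts week boundaries in its own way, so its result for dates that fall on a Sunday may differ.

```python
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the week containing the given date."""
    return day - timedelta(days=day.weekday())


def weekly_totals(sales):
    """Sum (date, amount) pairs per week, keyed by the Monday of each week."""
    totals = {}
    for day, amount in sales:
        key = week_start(day)
        totals[key] = totals.get(key, 0) + amount
    return totals


# Hypothetical sales, for illustration only.
sales = [
    (date(2008, 7, 7), 100),
    (date(2008, 7, 10), 50),
    (date(2008, 7, 14), 75),
]

print(week_start(date(2008, 7, 10)))  # 2008-07-07
print(weekly_totals(sales))
```

The first two sales land in the week of 7 July 2008 and the third in the week of 14 July 2008.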

Thursday, 10 July 2008 12:25:54 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Code Snippet | SQL
 Wednesday, 02 July 2008

This isn’t necessarily a critical part of a DBA’s job but, at times, it can be useful to have an idea of how many rows are in your databases.

The simplest way to get it is with this query:

-- indid 0 or 1 picks one row per table; xtype 'U' keeps user tables only
select sum(sysindexes.rowcnt)
from sysobjects
inner join sysindexes on sysindexes.id = sysobjects.id
where sysindexes.indid in (0, 1) and sysobjects.xtype = 'U'

This will get you the total number of rows in the entire database, for user tables only.  If you want the table-by-table breakdown, you can simply add the name of the object to the query:

select sysobjects.name, sysindexes.rowcnt
from sysobjects
inner join sysindexes on sysindexes.id = sysobjects.id
where sysindexes.indid in (0, 1) and sysobjects.xtype = 'U'
order by sysobjects.name

Wednesday, 02 July 2008 09:43:51 (Eastern Standard Time, UTC-05:00)  #    Comments [0] -
Code Snippet | SQL


Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2017
Stanislas Biron