Mysterious segmentation rules
Thread poster: Heinrich Pesch
Heinrich Pesch
Finland
Member (2003)
Finnish to German
Nov 9, 2006

One thing that always bothers me when working with Wordfast is the strange behaviour it shows when choosing the segmentation point.
I know I can influence the rules somewhat by adding items to the abbreviation list and choosing options other than the recommended ones, but why can it not use common sense in the first place?

Just now I had a sentence with "e.g. a fixed ". Wf first segmented after the first full stop; then I expanded the segment, but it stopped at the next full stop, after the "g.", and only after that did it take in the whole sentence.
When I look at the segmentation tools, the option "sentence" is not recommended. Why not? What would happen?
I would like Wf always to segment where a full stop is followed by a space and then a capital letter, that is, where the next sentence starts. But very often Wf grabs two or three sentences into one segment. On the other hand, it stupidly segments at places like "312.15.67", which is rather annoying.

What segmentation rules have you tried out and what do you use?

Regards
Heinrich


 
Valters Feists
Latvia
English to Latvian
experiment with finer settings... Nov 9, 2006

I too expand segments manually quite often. It's not that terrible if you know the keyboard shortcut (Alt+PgDn).
Are you also aware of the difference between manual page breaks and paragraph-end characters, and between non-breaking spaces and normal spaces?

I think you can fiddle with Wf's setup->segs->end-of-segment punctuation (ESP) + abbreviations. You can enter your own items in the abbreviations and ESP boxes (this has to be done carefully). This may depend on the Wf version, though.
A while ago I was looking for a way to make the regular space character act as a segment delimiter (so that one TU = one word), which unfortunately doesn't seem to be possible, so I have to resort to a roundabout replace-all-and-later-unreplace-all routine.
Apparently there are some things in Wf that you just can't control. :-/

Regards,
Valters Feists
Technical Latvian translator


 
Gerard de Noord
France
Member (2003)
English to Dutch
Don't make any special settings Nov 9, 2006

Hi Heinrich,

You shouldn't make any special settings if you want your segments to be Trados-compatible. Full sentences aren't.

When you encounter "e.g.", select those four characters and press Ctrl+Alt+T to add the abbreviation to the list of abbreviations. The text will be resegmented.

Regards,
Gerard


 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
TOPIC STARTER
Why Trados compatible? Nov 9, 2006

I would rather it were common-sense compatible.

If there is a full stop plus a space plus a capital letter, I would really like the segmentation to take place there and not two sentences later.
Can anybody explain why sentence segmentation is not recommended?
In my experience, Wf's segmentation rules differ from Trados's at least in lists where the items are numbered German fashion:
1.)
2.)
etc.

Brackets are a problem for Wf too. I often encounter situations where a ")." is left for the next segment and I cannot get Wf to segment after it; instead it jumps back and forth, either too far or not far enough.

Perhaps the reason for this is that Wf is French, and the French have some strange punctuation rules, if I remember right?

Regards
Heinrich

[Edited at 2006-11-09 18:53]


 
Philippe Etienne
Spain
Member
English to French
I love it Nov 9, 2006

Heinrich Pesch wrote:

I rather would it be common sense compatible
...


I am afraid the meaning of common sense is somewhat lost nowadays...
Thanks for the laugh
Philippe


 
Valters Feists
Latvia
English to Latvian
Wf 3.35 - more or less the following settings... Nov 10, 2006

In setup/segs:

1) Add "e.g." to the list of abbreviations.
Your list could be, for example, "Inc.,Corp.,Ltd.,e.g." (separate the items with commas).
2) You can leave . (full stop) in the ESP box,
3) ...but make sure you uncheck "An ESP without a trailing space ends a segment",
4) and also uncheck "An ESP + a space + a lowercase end a segment".
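
To make the effect of these four settings easier to picture, here is a minimal Python sketch of a toy segmenter. It is not Wordfast's actual algorithm; the function name `segment` and the parameters `break_without_space` and `break_before_lowercase` are invented here to mirror the two checkboxes, and the abbreviation list is the example list from step 1.

```python
def segment(text, esp=".", abbreviations=("Inc.", "Corp.", "Ltd.", "e.g."),
            break_without_space=False, break_before_lowercase=False):
    """Toy model: split text at end-of-segment punctuation (ESP)."""
    segments, start = [], 0
    for i, ch in enumerate(text):
        if ch not in esp:
            continue
        nxt = text[i + 1] if i + 1 < len(text) else ""
        after = text[i + 2] if i + 2 < len(text) else ""
        # Never break right after a listed abbreviation such as "e.g."
        if any(text[:i + 1].endswith(abbr) for abbr in abbreviations):
            continue
        if nxt == "":
            break  # end of text: the tail is appended below
        if nxt != " ":
            # Hypothetical stand-in for "An ESP without a trailing space ends a segment"
            if not break_without_space:
                continue
        elif after.islower() and not break_before_lowercase:
            # Hypothetical stand-in for "An ESP + a space + a lowercase end a segment"
            continue
        segments.append(text[start:i + 1].strip())
        start = i + 1
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


sample = "Use e.g. a fixed rate. Code 312.15.67 is valid. Done."
print(segment(sample))
print(segment(sample, break_without_space=True))
```

With both options off, the toy model keeps "e.g." and "312.15.67" intact and breaks only where a full stop is followed by a space and a non-lowercase character. Switching `break_without_space` on reproduces the kind of break inside "312.15.67" complained about earlier in the thread.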

P.S.
In French punctuation, a space character comes before exclamation and question marks; it also separates quote marks from words, e.g., « merci ! » .

Regards,
Valters Feists
Technical Latvian translator


 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
TOPIC STARTER
Thanks Valters Nov 12, 2006

I implemented the settings you suggested. So far I haven't noticed any changes. Wf continues to merge two sentences into one segment in certain places.
But when I try Trados 7.5, I notice that Trados has no difficulties with the same text. So Wf's segmentation rules are definitely not Trados-compatible.
The same text was also translated on another machine with a different Wf and different settings, and the result is the same as in my case.

Sorry, I cannot cite the text, as it is confidential, but they are normal sentences of the ". T" pattern (full stop, space, capital letter).

Regards
Heinrich


 
Valters Feists
Latvia
English to Latvian
could it be...? Nov 12, 2006

Could it be that the sentences are separated by a non-breaking space (nbs) character instead of a simple space? Check by switching on the inverted "P" (¶) button; the nbs characters will then be shown as degree signs, and normal spaces as middle dots. I think Wf cannot be trained to handle nbs's... my option would be to replace them before translating and restore them afterwards.
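
The same check can be run outside Word on plain text. The following is a rough Python sketch, with made-up helper names `reveal_spaces` and `nbsp_after_full_stop`: the first imitates the display described above (degree sign for a non-breaking space, middle dot for a normal space), the second lists the positions where a full stop is directly followed by a non-breaking space.

```python
import re

NBSP = "\u00a0"  # non-breaking space


def reveal_spaces(text):
    """Show non-breaking spaces as degree signs and normal spaces as middle dots."""
    return text.replace(NBSP, "°").replace(" ", "·")


def nbsp_after_full_stop(text):
    """Return the index of every full stop that is followed by a non-breaking space."""
    return [m.start() for m in re.finditer(r"\." + NBSP, text)]


sample = "First sentence." + NBSP + "Second sentence. Third."
print(reveal_spaces(sample))
print(nbsp_after_full_stop(sample))
```

An empty list from `nbsp_after_full_stop` would suggest that non-breaking spaces are not what is keeping the sentences together.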

 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
TOPIC STARTER
No, they are normal spaces Nov 12, 2006

This phenomenon is so common that I never thought about it before. Almost every document has such stumbling blocks for no apparent reason, except that the sentences include numbers and abbreviations.
Cheers
Heinrich


 
Mick De Meyer
Belgium
English to Dutch
More weird segmentation, any help? Apr 17, 2011

Hi all,

Here is a specific problem I'm having with segmentation: the curly brackets.

I often need to translate this kind of sentence string:

Sentence number one.{end_li}{li}Sentence number two.{end_li}{end_para}{breakline}{para}Sentence number three.

... and so on. However, Wordfast mysteriously decides that this is a segment:

Sentence number one.{

How on earth can I simply teach it to segment before the curly bracket? How is it at all logical to end a segment with an open bracket?

If anyone knows how this is done, I would be very grateful!
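
For illustration only, the desired behaviour (break before the curly bracket, never after it) can be sketched in Python as a step outside Wordfast. This is not a Wordfast setting, and the thread does not say whether one exists; the name `split_on_tags` and the assumption that every `{...}` run is a tag are mine.

```python
import re

# Assumption: anything between a "{" and the next "}" is a non-translatable tag.
TAG = re.compile(r"(\{[^{}]*\})")


def split_on_tags(text):
    """Return (kind, value) pairs: 'tag' for {...} placeholders, 'text' otherwise."""
    parts = []
    for piece in TAG.split(text):
        if not piece:
            continue  # empty strings appear between adjacent tags
        kind = "tag" if TAG.fullmatch(piece) else "text"
        parts.append((kind, piece))
    return parts


sample = ("Sentence number one.{end_li}{li}Sentence number two."
          "{end_li}{end_para}{breakline}{para}Sentence number three.")
for kind, value in split_on_tags(sample):
    print(kind, value)
```

Each sentence comes out as its own text piece, and no piece ends with a dangling open bracket.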


 

