Wednesday, 18 November 2009

Automatically generating Hebrew transliteration

This is largely independent of which transliteration scheme we choose, so
I've broken it off into a separate thread.


David IB wrote:
> You also made the very important point that it must be possible to
> automatically generate this transliteration from Hebrew and back to
> Hebrew again.
> I think this is possible, with a couple of caveats:

Just for clarity, I don't think we'll be able to get back to the Hebrew,
but we won't need to either. As well as dageshes, we're going to lose some
vowel distinctions (notably the shewa-based vowels).
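To illustrate why the round trip fails, here is a minimal Python sketch.
The mapping tables are hypothetical and deliberately tiny (they are not
the scheme under discussion); the only point is that once the dagesh is
dropped and a shewa-based vowel is merged with its full vowel, two
different Hebrew spellings give the same Latin output.

```python
# Hypothetical, deliberately tiny tables: NOT the scheme under discussion.
VOWELS = {
    "\u05B7": "a",  # patah
    "\u05B2": "a",  # hataf patah (shewa-based), merged with patah
    "\u05B6": "e",  # segol
    "\u05B1": "e",  # hataf segol (shewa-based), merged with segol
}
CONSONANTS = {
    "\u05D0": "'",  # alef
    "\u05D1": "b",  # bet
    "\u05D3": "d",  # dalet
    "\u05DE": "m",  # mem
}
DAGESH = "\u05BC"


def transliterate_lossy(hebrew):
    """Map pointed Hebrew to Latin letters, dropping every dagesh."""
    out = []
    for ch in hebrew:
        if ch == DAGESH:
            continue  # the dagesh leaves no trace in the output
        # Characters outside the toy tables are passed through unchanged.
        out.append(VOWELS.get(ch) or CONSONANTS.get(ch) or ch)
    return "".join(out)


bet_dagesh_patah = "\u05D1\u05BC\u05B7"  # bet + dagesh + patah
bet_hataf_patah = "\u05D1\u05B2"         # bet + hataf patah
print(transliterate_lossy(bet_dagesh_patah))
print(transliterate_lossy(bet_hataf_patah))
```

Both inputs come out as the same string, so no function of the output
alone can tell which Hebrew spelling it came from.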

> First automatic generation won't distinguish between /qames/ and
> /qames hatuph/ (ie when does the T-shaped vowel sound like an "a" and
> when does it sound like an "o")
> I don't think I can come up with an algorithm which will do this
> automatically.
> We have two choices:
> 1) ignore it and simply use "a" (which is fine in the vast majority of
>    cases)
> 2) generate a whole OT text with correct "a" and "o" (I can do that) and
>    manually change the half-dozen or so lexical entries which need it

I think we do want the o if at all possible. Sounds like you've got a
method which does the hard bit of working out which qamats in the text
is which!
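
As a rough illustration of how the two choices differ, here is a Python
sketch. It is not David's method, which isn't described in this thread.
It assumes the "which qamats is which" information arrives in one of two
ways: as a set of character positions supplied alongside the word (the
hypothetical hatuph_at parameter), or because the source text already
uses the separate Unicode code point U+05C7 (QAMATS QATAN). The consonant
table is a toy one covering only the example word.

```python
QAMATS = "\u05B8"        # the ordinary T-shaped vowel point
QAMATS_QATAN = "\u05C7"  # separate code point some texts use for the "o" case
DAGESH = "\u05BC"

# Toy table, just enough for the example below (hypothetical scheme).
CONSONANTS = {"\u05DB": "k", "\u05DC": "l"}


def transliterate_qamats(word, hatuph_at=frozenset()):
    """Render qamats as "a" unless it is marked as qamats hatuph.

    hatuph_at is a hypothetical annotation: the string indices in `word`
    at which a qamats should be read as "o".
    """
    out = []
    for i, ch in enumerate(word):
        if ch == DAGESH:
            continue  # dropped, as elsewhere in the transliteration
        if ch == QAMATS_QATAN or (ch == QAMATS and i in hatuph_at):
            out.append("o")
        elif ch == QAMATS:
            out.append("a")  # choice 1: the default reading
        else:
            out.append(CONSONANTS.get(ch, ch))
    return "".join(out)


kol = "\u05DB\u05BC\u05B8\u05DC"  # kaf, dagesh, qamats, lamed
print(transliterate_qamats(kol))                 # choice 1: always "a"
print(transliterate_qamats(kol, hatuph_at={2}))  # choice 2: marked as "o"
```

With no annotation the function behaves like choice 1; with the
annotation (or with U+05C7 in the source) it gives the "o" that choice 2
is after.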

Colin
