Re: Hebrew transliteration

Tyndale STEP Project wrote:
[a fair bit which I'm entirely happy with has been excised. Will address
the qamats issue in a separate post]

> Typing is the main reason I have avoided dots under letters and complex
> accents over letters like S.
> I have aimed at something which uses only the extended ASCII character set.

Which extended character set are you looking at? I haven't found one
which contains underlined consonants - so at least for display and
typing there seems to be no difference between underline and underdot.
(Agreed for typing in that going for capitals makes sense.) Underline
has one advantage in that it's easier to see. However it has one
disadvantage in that other sources use it for bgdkpt differences, but
that's livable with.

> You asked:
>> * how are vocal schwa and the schwa-plus vowels represented?
>
> I don't think there is any need to distinguish because the
> shewa+vowel is simply the way that the vowel is written under letters
> like Aleph
> - ie every time a vowel occurs under an Aleph etc, a shewa is added. So
> we only need to record the vowel itself.

Unless I've misunderstood what you're saying, this isn't true. (Look at
"elohim" and "erets" in Gen 1:10 for instance - both with segols on
alephs, one with a shewa, one not.) However, I think it's a further
distinction in the transliteration we can disregard.

> I don't share your concern that people are going to confuse this system
> with others,
> because other systems don't use underlining and they have lots of funny
> accents.
>
> You make a good point about samek being normally "s".
> Do you think it would be better if /sin/ were _s_ and samek s ?
>
> I'm not sure about confusion with _t_. I think people will quickly get
> used to the idea
> that letters which normally have a dot under them in other systems tend
> to be underlined in this one.

Your first and third paragraphs can be paraphrased as:
* people are going to regard our system as completely different from the
standard ones.
* people are going to regard our system as the same as the standard ones
but with underlines replacing dots.

Which do you want? ;)

I don't know why I'm failing to convince you that when, as far as I'm
aware, _every single book in English_ which uses a scholarly-type
transliteration maps the consonants in almost the same way, doing
something which is subtly different (for a marginal advantage) is going
to confuse and frustrate people the minute they encounter other books.
I'll have one more go and then shut up.

1. Some changes between systems are fine. So if we _consistently_ use
underlines where other books use dots, that's an easy rule to remember.
(Particularly as they'll have to remember underline->capital for typing
things in.) Likewise, it's not a major problem that waw is w in some sources
and v in others, because readers can mentally say "v and w are the same".

2. Similarly, if we drop '` for aleph/ayin consistently, and use ' for a
vocal break instead, that's fine. (We'd presumably drop the ' when doing
the search, so people could type in ...aim words without the '.)

3. When we're remapping individual letters, however, this is going to
throw people. We expect (and hope) that people using our system will
internalise the transliteration and probably learn key words as well.
However, if they have learnt _h_e_s_ed from our system, and then
encounter h.esed in a book, they're going to do a double-take and think
"is this the same word or not" because their natural assumption is to
expect h.es.ed. Things are going to be worse coming the other way.
People are going to read a Hebrew transliterated word in a book, and
want to know what STEP says about it, so type the word into our browser
as Hesed (following the type-in rule "convert dots -> capitals and ignore
all accents on vowels") and get back "word not found". And I would expect
even experts to make this mistake regularly. To be able to use STEP
alongside other resources, you're requiring people to carry around two
different transliteration schemes and some fairly subtle differences to
map between the two, and this is a lot of mental baggage that could be
more usefully employed in thinking about the theological point at issue!

4. It looks like accessing STEP resources by type-in is our worst
use-case, so let's consider what the major differences between
transliteration and type-in are:

a. dots (in standard)/underlines (in our system) -> capitals. Ignore
bgdkpt underlines in standard system for the (rare) sources that have
them. Nice and consistent. No problem.

b. ignore accents on vowels. Ditto. This gets rid of most of the
differences between transliteration schemes.

c. different use of '`. In general this isn't a problem, because we can
just tell people not to type in the '` (or ignore it if they do). The
only disadvantage is we lose the distinction between words that are
identical except for having an aleph or ayin. I don't think there are
too many of these though.

d. yod and waw as vowels.

e. different consonants and the multiple S problem. (In the standard
system, sin and shin are the only consonants written with an accent
other than dot.)
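
As a rough illustration of rules a to c only (a sketch, not a description of how STEP does or should do it), the conversion from a displayed transliteration to its type-in form might look like the following. The function name `to_typein` and its `scheme` parameter are made up for this example, and it assumes dots and underlines arrive as Unicode combining marks. Accents on consonants other than the dot/underline are deliberately left alone, since they belong to point e.

```python
import unicodedata

DOT_BELOW = "\u0323"
UNDERLINES = {"\u0331", "\u0332"}                # macron below, low line
BREAK_MARKS = set("'`\u02be\u02bf\u2018\u2019")  # aleph/ayin and vocal-break signs
VOWELS = set("aeiou")


def to_typein(word, scheme="standard"):
    """Hypothetical helper: reduce a displayed transliteration to type-in form."""
    if scheme == "standard":
        # Dots mark the emphatic consonants; underlines are bgdkpt marks.
        emphatic, ignored = {DOT_BELOW}, UNDERLINES
    else:
        # Assumed "step" display: underlines stand where others use dots.
        emphatic, ignored = UNDERLINES, set()
    out = []
    for ch in unicodedata.normalize("NFD", word):
        if ch in BREAK_MARKS:
            continue                                    # rule c
        if not unicodedata.combining(ch):
            out.append(ch)                              # a base letter
        elif not out:
            continue                                    # stray mark, no base
        elif ch in emphatic:
            out[-1] = out[-1][0].upper() + out[-1][1:]  # rule a
        elif ch in ignored or out[-1][0].lower() in VOWELS:
            continue                                    # rule a (bgdkpt), rule b
        else:
            out[-1] += ch                               # left for point e
    return unicodedata.normalize("NFC", "".join(out))


print(to_typein("\u1e25esed"))                  # h-with-dot + esed -> Hesed
print(to_typein("h\u0331esed", scheme="step"))  # underlined h + esed -> Hesed
```

Both spellings of the word end up as the same search string, which is the property we want from the type-in rule.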

On d: we're proposing to render (eg) tsere-yod as ëy. Other systems vary
in whether they render these as "ey" or "e", with varied accents. It
would be helpful if people could type in the "e" form and get the right
entry back. This should be easy to do.
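
One way this could work, sketched under assumptions: index every entry under both its full form and a form with the vowel-letter yod dropped. The names `search_keys` and `build_index` are hypothetical, and the pattern below simply guesses that a "y" directly after e or i and not followed by another vowel or y is a vowel letter. That is a heuristic for illustration, not a rule anyone has agreed.

```python
import re
import unicodedata

# Heuristic (an assumption): "y" after e/i, not followed by a vowel or
# another y, is treated as a vowel letter rather than a consonant.
VOWEL_YOD = re.compile(r"(?<=[ei])y(?![aeiouy])")


def strip_accents(text):
    """Remove all combining marks, e.g. the diaeresis of the proposed ëy."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def search_keys(translit):
    """Return the spellings under which one entry should be findable."""
    full = strip_accents(translit)
    short = VOWEL_YOD.sub("", full)
    return {full, short}


def build_index(entries):
    """Map every accepted spelling to the entries it could refer to."""
    index = {}
    for translit in entries:
        for key in search_keys(translit):
            index.setdefault(key, set()).add(translit)
    return index


index = build_index(["b\u00ebyt"])   # "bëyt", a tsere-yod form
print(index)                         # reachable as both "beyt" and "bet"
```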

On e: if we were to adopt the standard system, we would need to find a
means of entering sin and shin (s is samekh, S is tsade), so we
need to make a change here anyway. One common option seems to be {,},
however "sh" for shin and perhaps "ss" for sin would work better, and we
could reasonably use these for the transliterations too. (Although see
below on how lookup is going to work - this may scupper this idea.)

This is the minimal change and just requires people to remember the
correspondence between ss and sh and the s's with the various accents -
it also doesn't contradict the basic dot/underline->capital rule.
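
To make the four-way mapping concrete, here is a minimal sketch. The table and the function name `sibilants_to_typein` are illustrative only, and it assumes the standard accented letters (s with dot below, acute and caron) in the source text.

```python
import unicodedata

# Assumed correspondences for typing in the four s-like consonants.
SIBILANT_TYPEIN = {
    "s": "s",        # samekh
    "\u1e63": "S",   # s with dot below: tsade (dot -> capital)
    "\u015b": "ss",  # s with acute: sin
    "\u0161": "sh",  # s with caron: shin
}


def sibilants_to_typein(word):
    """Replace accented sibilants with their proposed type-in spellings."""
    composed = unicodedata.normalize("NFC", word)
    return "".join(SIBILANT_TYPEIN.get(ch, ch) for ch in composed)


print(sibilants_to_typein("\u0161alom"))    # shin  -> shalom
print(sibilants_to_typein("\u1e63addiq"))   # tsade -> Saddiq
```

One caveat: the mapping is not reversible by itself, because a typed "ss" could also be a doubled samekh.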

> You ask about bothering with yod and vav when used as a vowel.
> I think we need to mark these for looking up as a concordance and lexicon.
> If they double-click on '_i_r (ie 'city') we need the software to look
> up ayin-yod-resh,
> so even if the user doesn't care whether the "i" is underlined, the
> software needs this information.

Where's this going to come from if the user types in the word "ir"? We
won't have the underlining information, or the ayin (it could have been
aleph). We're going to need a transliteration -> Hebrew table for these
entries, and probably also for words involving shin, so we know that sh
is shin and not sin (or samekh) + het.
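
A small sketch of what such a table might look like, assuming we build it from the lexicon rather than trying to reverse the transliteration. The sample entries, `typein_key` and `build_table` are all hypothetical, and the key function is simplified (it ignores the dot -> capital rule, which none of these three words needs).

```python
import unicodedata
from collections import defaultdict

DROPPED = set("'`\u02be\u02bf")   # aleph/ayin signs are not typed in

# Hypothetical sample entries: (Hebrew consonants, scholarly transliteration).
LEXICON = [
    ("עיר", "\u02bf\u00eer"),   # ayin-yod-resh, "city"
    ("אור", "\u02be\u00f4r"),   # aleph-waw-resh, "light"
    ("עור", "\u02bf\u00f4r"),   # ayin-waw-resh, "skin"
]


def typein_key(translit):
    """Simplified key: strip accents and the aleph/ayin signs."""
    decomposed = unicodedata.normalize("NFD", translit)
    return "".join(c for c in decomposed
                   if not unicodedata.combining(c) and c not in DROPPED)


def build_table(lexicon):
    """Map each typed form to every Hebrew word it could stand for."""
    table = defaultdict(list)
    for hebrew, translit in lexicon:
        table[typein_key(translit)].append(hebrew)
    return dict(table)


table = build_table(LEXICON)
print(table["ir"])   # one candidate
print(table["or"])   # two candidates: the aleph and the ayin word
```

Typing "or" returns two candidates, so the interface would have to offer a choice instead of guessing.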

Also, can I confirm we're requiring people to type in the vowels, and
won't permit "Hsd" for hesed? (If we permit consonant-only systems, we
have to be more careful with the transliteration to prevent ambiguities.)
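
To show the sort of ambiguity I mean, here is a quick sketch that groups type-in forms by their consonant skeleton and reports the clashes. The word list is just a handful of familiar examples and the function names are invented for the illustration.

```python
from collections import defaultdict

VOWELS = set("aeiou")


def skeleton(typein):
    """Drop the vowels from a type-in form, keeping case (H stays H)."""
    return "".join(c for c in typein if c.lower() not in VOWELS)


def collisions(words):
    """Return the skeletons shared by more than one word."""
    groups = defaultdict(list)
    for word in words:
        groups[skeleton(word)].append(word)
    return {key: group for key, group in groups.items() if len(group) > 1}


words = ["Hesed", "melek", "malak", "dabar", "deber"]
print(collisions(words))
# {'mlk': ['melek', 'malak'], 'dbr': ['dabar', 'deber']}
```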

So to summarise and try and focus future discussions, if given a free
rein, this is what I'd do, together with my opinion on proposed
differences. My starting point is the usual standards, and I'm trying to
make changes only where they're either needed or useful.

The ones I've marked with * are points which, to be honest, I now feel
sufficiently strongly about that I wouldn't bother trying to convince me
otherwise, and we'll just have to agree to disagree. I can work with
whatever you decide, however much I like it or not.

Consonants: start with the standard system
1. waw not standardised: w preferable, but v would be OK.
2. would probably prefer to mark bgdkpt differences (as some do) to aid
pronunciation, but not strongly.
3. underline instead of dot for display: seems a largely pointless
change, unless there are advantages for typing that I haven't seen.
4. dot/underline -> capital for typing in. Necessary change.
5. sin/shin -> "ss"/"sh" for typing in. Have to do something, and this
seems best.
6. sin/shin -> "ss"/"sh" in transliteration. Given 5 this seems a good
idea. Would be fine to stick with the standard accents though.
7. drop representation of aleph/ayin unless necessary for pronunciation.
Not sure if this is being proposed or not. Would prefer to keep them in
display, to link with what's in the Hebrew, but drop them for search.
Could live with losing them.
8.* Any other consonant changes. Don't gain us anything and just add
confusion. (So stick with samekh = s, tsade = _s_.)

Vowels: no one standard system to compare against.
1. I'd prefer long-short vowels to be marked (more information and easy
for the user to filter out) but no problem if we decide to drop this
information.
2.* However we mark vowels, it needs to be consistent, so if we're
distinguishing long/short vowels, use the standard circumflex for long
vowels, rather than the current mix (a circumflex is long, e and o
circumflex are short!) I'm not convinced that "visual link" to Hebrew is
a good reason for these differences, particularly as it's not a strong
link and above the letter.
3. yod as vowels. Adding y for yods seems reasonable and generally
reflects pronunciation. Would prefer to do it in all cases, but i
circumflex a reasonable alternative for i plus yod.
4.* Tagging long o and long u because they happen to be represented by
waws is pointless - anyone who knows any Hebrew will know this, anyone
who doesn't won't care, and it doesn't help us in searches (see above).
5.* Similarly ë for tsere. Pointless and an unnecessary difference from
(I think) everyone else.
6. Effectively dropping schwa from schwa+ vowels. Fine, and no obvious
sensible way to represent them otherwise.
7. Vocal schwa. Can be either e or '. Not sure what's being proposed.
Either would be fine - e is more standard, ' reflects pronunciation
better, so undecided.
8. ' for representing "non-diphthong" in eg hammaim ending. Using a
diaeresis (hammaïm) is standard orthography (in linguistics generally,
including in NT Greek) so would recommend going for that - prevents
confusion with aleph/ayin as well. (I think we only need to do this with
a short i, correct? If we need to do it before a long vowel, obviously
this doesn't work so back to ').

Tyndale STEP - Programming

Wednesday, 18 November 2009
