In general, some files are in Windows ASCII (ANSI code page) format and some are in a Unicode format that uses numeric references like &#xxxx;. Of course, you can use scripting or something else to convert from one format to another.
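As a rough illustration of converting to and from the &#xxxx; form, here is a minimal Python sketch. The function names are made up for this example, and it assumes the references are decimal or hex HTML numeric character references. Note that `html.unescape` also expands named entities such as `&amp;`.

```python
import html


def decode_numeric_refs(text: str) -> str:
    """Turn references such as &#1575; into the characters they stand for."""
    return html.unescape(text)


def encode_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal &#NNNN; reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# Round trip on three Arabic letters (alef, lam, kaf), written as escapes
# so the example does not depend on the console encoding.
encoded = encode_numeric_refs("\u0627\u0644\u0643")
print(encoded)  # &#1575;&#1604;&#1603;
print(decode_numeric_refs(encoded) == "\u0627\u0644\u0643")  # True
```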
----------------------- Ahmed ELsheshtawy IslamWare CEO www.islamware.com -----------------------
Assalamu alaikum warahmatullah!\
\
We are planning to build a database containing the Swedish translation of Sahih Bukhari and Sahih Muslim. We therefore need the Unicode version of Sahih Bukhari and Sahih Muslim. We would appreciate it if somebody knows where we can find the Unicode version of these books. \
\
Contact me please via:\
ikfm99@gmail.com\
\
Wajazakum Allahu khairan!\
\
Abu Iman\
Sweden
I just answered a similar question about this. It is easy to do the ASCII-to-Unicode conversion for Arabic; here is another post that I found by googling "arabic ascii to unicode":
UNICODE-ASCII-Arabic and conversion problems? Use this algorithm
So, you've used Microsoft Office applications like Excel or Access and SQL Server to store and retrieve Arabic text. There are times, though, when your data seems to be corrupt or you get dummy Latin characters when you're expecting the correct Arabic text. Explaining how to configure SQL Server, Excel or Access to hold Arabic text correctly is not the topic of this blog post. Instead, I'd like to provide you with an algorithm and sample VBA code to help you convert the Latin text back to Arabic.
There are many reasons for the Arabic text to appear in Latin. One of the reasons may be your SQL Server collation settings, or the fact that you have used VARCHAR (or in this context CHAR, TEXT, etc.) when you're supposed to use NVARCHAR (or NTEXT, NCHAR, etc.). The N prefix in the type makes sure that the information in the column is stored in Unicode, so your Arabic text displays correctly everywhere. However, if you are getting something like ÇáßãÈíæÊÑ when you are supposed to get الكمبيوتر, then it's already too late.
Good news for you guys! I have figured out the conversion table between such Latin characters and the corresponding Arabic text in Unicode. The algorithm is simple: there is a difference between the character codes of the Latin text and those of the Arabic text (based on code page 1256: Arabic Windows). Unfortunately, this difference is not a constant value, as the order of characters is not the same either. Hence, to convert those Latin letters into Unicode, all you have to do is scan the text one character at a time, get the Unicode value of the character, and add the deficit corresponding to its range from the table below:
| ASCII range for Latin text: From | To | ASCII range for Latin text (in hex): From | To | Deficit between ASCII and Unicode for Arabic |
|---|---|---|---|---|
| 192 | 214 | C0 | D6 | 1376 |
| 216 | 219 | D8 | DB | 1375 |
| 221 | 223 | DD | DF | 1380 |
| 225 | 225 | E1 | E1 | 1379 |
| 227 | 230 | E3 | E6 | 1378 |
| 236 | 237 | EC | ED | 1373 |
Note: This table is a draft one. Although I tried all possible Arabic characters, including special ones, I'm not sure if this is the complete table. However, it is guaranteed that all textual characters are included.
Here's some VBA code you can use within Excel or Access to convert the Latin text to Arabic. I have used Excel as an example:

    Sub convert2ara()
        Dim s As Range, i As Long, j As Long, strNew As String
        ' Convert every selected cell and write the result one column to the right.
        For Each s In Selection
            strNew = ""
            For i = 1 To Len(s.Text)
                j = AscW(Mid(s.Text, i, 1))
                ' Add the deficit for the range that contains the character code.
                Select Case j
                    Case &HC0 To &HD6: j = j + 1376
                    Case &HD8 To &HDB: j = j + 1375
                    Case &HDD To &HDF: j = j + 1380
                    Case &HE1: j = j + 1379
                    Case &HE3 To &HE6: j = j + 1378
                    Case &HEC To &HED: j = j + 1373
                End Select
                strNew = strNew & ChrW(j)
            Next i
            Cells(s.Row, s.Column + 1) = strNew
        Next s
    End Sub
Here's also a T-SQL Stored Procedure that does the same thing:
    CREATE PROCEDURE Convert2Ara
        @Latin  VARCHAR(100),
        @Arabic NVARCHAR(100) = N'' OUTPUT
    AS
    BEGIN
        DECLARE @Ind INT, @Len INT, @Src INT
        -- Start from an empty string even if the caller passed a NULL variable.
        SET @Arabic = N''
        SET @Len = LEN(@Latin)
        SET @Ind = 1
        WHILE @Ind <= @Len
        BEGIN
            SET @Src = ASCII(SUBSTRING(@Latin, @Ind, 1))
            SET @Src = CASE
                WHEN @Src BETWEEN 192 AND 214 THEN @Src + 1376
                WHEN @Src BETWEEN 216 AND 219 THEN @Src + 1375
                WHEN @Src BETWEEN 221 AND 223 THEN @Src + 1380
                WHEN @Src = 225 THEN @Src + 1379
                WHEN @Src BETWEEN 227 AND 230 THEN @Src + 1378
                WHEN @Src BETWEEN 236 AND 237 THEN @Src + 1373
                -- Keep spaces, digits and other characters unchanged; without
                -- this branch the CASE returns NULL and the whole result is NULL.
                ELSE @Src
            END
            SET @Arabic = @Arabic + NCHAR(@Src)
            SET @Ind = @Ind + 1
        END
    END
Try it and tell me if it works for you.
Special thanks to my friend, Mahdi Al-Saffar, whose request for some help inspired me to find a solution and publish this post.
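For readers who are not using VBA or T-SQL, the same range-and-deficit lookup can be sketched in Python. This is only an illustration of the table quoted in the post: the name `latin_to_arabic` and the `OFFSETS` list are invented for this example, and the ranges are copied from the post as-is, so anything the draft table does not list is left unchanged.

```python
# (first code, last code, deficit) rows copied from the table in the post.
OFFSETS = [
    (0xC0, 0xD6, 1376),
    (0xD8, 0xDB, 1375),
    (0xDD, 0xDF, 1380),
    (0xE1, 0xE1, 1379),
    (0xE3, 0xE6, 1378),
    (0xEC, 0xED, 1373),
]


def latin_to_arabic(text: str) -> str:
    """Shift each character whose code falls in a listed range by its deficit."""
    out = []
    for ch in text:
        code = ord(ch)
        for first, last, deficit in OFFSETS:
            if first <= code <= last:
                code += deficit
                break
        # Characters outside every range (spaces, digits, ...) pass through.
        out.append(chr(code))
    return "".join(out)


# The garbled word from the post; print code points rather than glyphs so the
# output does not depend on the console being able to show Arabic.
fixed = latin_to_arabic("ÇáßãÈíæÊÑ")
print(" ".join(f"U+{ord(c):04X}" for c in fixed))
```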
Code websites like http://planetsourcecode.com also have other ready-made code; for example, just search for "arabic unicode" under Visual Basic.
You can also do it directly in most if not all languages; for example, in VB you can use Asc, AscW, Chr, and ChrW.
In Islamkit, I also use JS code for browser conversion once the user types Arabic and changes the form field in the QuranManager; here is the JS code for inspiration:
Salaam bro, I'm really sorry to sound stupid, but I don't understand how I can convert this:
þ þÍóÏøóËóäóÇ þ þÞõÊóíúÈóÉõ Èúäõ ÓóÚöíÏò þ þÍóÏøóËóäóÇ þ þÃóÈõæ ÚóæóÇäóÉó þ þÚóäú þ þÓöãóÇßö Èúäö ÍóÑúÈò þ þÍ þ
to Arabic. I tried one of the links you gave but it didn't work. The characters above are exactly from one of the hadith books in Arabic. If you don't mind, would you please help? ws abu
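One possible way to handle text like this sample, sketched in Python: assuming the garbled characters are Windows-1256 (Arabic) bytes that were displayed as Windows-1252 (Western), encoding back to the original bytes and decoding them as Windows-1256 recovers the Arabic. The sample includes characters such as ó, õ and þ, which lie outside the ranges of the draft table quoted earlier, so a full code page round trip is used here instead. The function name `fix_mojibake` is made up for this example.

```python
def fix_mojibake(garbled: str) -> str:
    """Undo 'Arabic shown as Latin', assuming cp1256 bytes were read as cp1252."""
    raw = garbled.encode("cp1252")   # back to the original byte values
    return raw.decode("cp1256")      # reinterpret those bytes as Arabic


# First word of the sample above. Code points are printed instead of glyphs
# so the output does not depend on the console being able to show Arabic.
fixed = fix_mojibake("ÍóÏøóËóäóÇ")
print(" ".join(f"U+{ord(c):04X}" for c in fixed))
```

If the text contains characters that Windows-1252 cannot encode, `encode` raises `UnicodeEncodeError`; that would suggest the text was garbled through a different code page than the one assumed here.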
Salam all, I think I have found a solution to reading these files.
I basically changed the language settings on a Safari browser to Arabic language, and then dropped the Notepad file into the browser window. Voila, it opens in the Arabic alphabet.
From there, I think you can paste the text into a Rich Text Format file (WordPad, free with Windows, and I think Macs can read that format too).
That is what I have done and it appears to work. I will try to return to this forum and post my results, God willing.
P.S. I think you might have to change the language settings in Control Panel first, to enable Arabic and other Eastern languages to appear on the language menu in your various programs. This may mean having the original Windows installation disk at hand; Windows already knows which folder holds the relevant ".dll" files (they are needed to support the new languages you are opening your PC to).
Maybe I can only view the Arabic language now because of the Control Panel change I had made. Maybe on other people's PCs, the Arabic script still won't be visible?
Salam.
Salam. I have converted all but Sahih al-Bukhari from the weird letters into actual Arabic language, using the method outlined in my previous post (Change Safari browser settings to accept Arabic language, drop Notepad file into Safari browser, Copy entire document on Safari, Paste into WordPad, Save as .rtf or .doc). An alternative could be to open the Notepad file with Microsoft Word -- but this could cause bad screen freezes, when Word changes everything into Arabic language.
Please be aware that every step will involve major screen freezes unless you have a brilliant, fast, modern laptop.
By the way, the finished files were too large for me to post, sorry. One was over 50 MB. And as I mentioned, Sahih al-Bukhari was too big to process. I'm guessing it would result in a 100 MB .rtf or .doc file.
Anyway, have fun. Salam.
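For files that are too big to push through a browser or Word, a script can do the conversion in small pieces. The following Python sketch assumes the Notepad files are plain text stored as Windows-1256 bytes (which would explain why they display as Arabic once the viewer is set to Arabic). The function name, parameters and chunk size are choices made for this example only.

```python
def convert_file(src_path, dst_path, src_encoding="cp1256", chunk_chars=1 << 20):
    """Re-save a text file as UTF-8, reading it in chunks to limit memory use."""
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_chars)  # up to chunk_chars decoded characters
            if not chunk:
                break
            dst.write(chunk)
```

A call such as `convert_file("bukhari.txt", "bukhari-utf8.txt")` (hypothetical file names) would write a UTF-8 copy that most current editors and browsers open as Arabic without changing any language settings. Plain UTF-8 text is also likely to be much smaller than an .rtf or .doc version of the same book.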