
Thread: LUA parser does not handle new lines and encoding properly.

  1. #1

    LUA parser does not handle new lines and encoding properly.

    Hello,

    I just tried to update my mod and noticed there are two very bad bugs in LUA script parsing.

    Problem #1:
    Scripts that contain only newline characters (\n) without carriage returns (\r) cannot be loaded.

    Solution #1:
    Ignore the presence of \r just like the XML parser does.
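
    A minimal sketch of Solution #1, written in Python purely for illustration (the function name is hypothetical and this is not HoN's actual loader): fold every line-ending style into \n before the script reaches the parser, so the presence or absence of \r no longer matters.

```python
def normalize_newlines(source: str) -> str:
    """Fold Windows (CR LF) and old Mac (CR) line endings into LF."""
    # Replace the two-character CR LF sequence first, so that its CR
    # is not converted on its own and turned into an extra blank line.
    return source.replace("\r\n", "\n").replace("\r", "\n")
```

    With this in place, a file saved with \n only and a file saved with \r\n produce the same text for the parser.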

    Problem #2:
    LUA scripts are encoded in ANSI format. The LUA parser does not properly handle the UTF-8 BOM.

    Solution #2:
    Read the BOM and switch to UTF-x instead of ANSI, just like the XML parser does.
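
    Solution #2 could look roughly like the following Python sketch. The names `decode_script` and `_BOMS` are made up for this example, and the Windows-1252 fallback is an assumption about what "ANSI" means here; it only shows the idea of checking for a BOM before choosing a decoder.

```python
import codecs

# Checked in order. UTF-32 comes before UTF-16 because the UTF-32 LE BOM
# starts with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_script(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode script bytes, honouring a leading BOM if one is present."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            # These codecs consume the BOM, so it never reaches the parser.
            return raw.decode(encoding)
    # No BOM: assume the legacy "ANSI" code page (an assumption).
    return raw.decode(fallback)
```

    Files without a BOM still go through the ANSI path, so existing scripts would load exactly as before.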

    Why?
    The Mod Manager uses only \n to be Linux compatible and correctly saves files in UTF-8 format (not configurable). Since HoN does not load these files, it is impossible for mod authors to change any LUA script without making a compatible copy and replacing the original code completely.

    Bangerz, I guess you are the man for this job. Could you please keep me up to date on any progress in this direction?

    greetings
    Mertsch

  2. #2
    I mentioned that nearly a month ago in Shirkit's thread (Java Mod Manager), and Notausgang posted in it. Shirkit patched his ModManager so you can actually apply mods to lua files now. I think Notausgang did the same (but I'm not using his MM).
    Just get the newest versions and your Problem #2 will be fixed (maybe #1 too).
    :ChargedHammer: was my idea! Proof here:
    http://www.playdota.com/forums/2254/...make-mjollnir/

  3. #3
    Quote Originally Posted by Manu311 View Post
    I mentioned that nearly a month ago in Shirkit's thread (Java Mod Manager), and Notausgang posted in it. Shirkit patched his ModManager so you can actually apply mods to lua files now. I think Notausgang did the same (but I'm not using his MM).
    Just get the newest versions and your Problem #2 will be fixed (maybe #1 too).
    Thanks for the info, but I am not willing to force people to use Java, and I don't want to use it either.
    And since this really is a bug in the system, I'd rather wait for a real fix than work with tailored solutions.

  4. #4
    Wait, you mean that the LUA parser doesn't accept UTF-8 without a BOM?

    Because I am writing UTF-8 files without the BOM header, and I read and open them assuming they are UTF-8. This could lead to some serious problems: any non-ASCII character can be interpreted wrongly. And if I must detect which charset is used for each file, I'll need an additional library to do that, and so will Notausgang. Also, are the LUA files encoded in ANSI on Linux/Mac? I really doubt it, as ANSI here means Windows-1252, which is mainly used on NT systems.

    Can anyone confirm if the Linux AND Mac files are encoded as Windows-1252 also?
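
    For files without a BOM, one common heuristic that needs no extra library is to try strict UTF-8 first and fall back to Windows-1252 only if that fails. This is a sketch in Python, not a guaranteed detector: the function name is hypothetical, and it relies on the observation that Windows-1252 text containing non-ASCII characters is rarely valid UTF-8.

```python
def decode_without_bom(raw: bytes) -> str:
    """Guess between UTF-8 and Windows-1252 for a file that has no BOM."""
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fall back to the legacy code page. The few byte values that
        # Windows-1252 leaves undefined become U+FFFD instead of raising.
        return raw.decode("cp1252", errors="replace")
```

    Pure ASCII files decode identically either way, so the guess only matters once a script contains special characters.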

  5. #5
    Quote Originally Posted by Mertsch View Post
    Thanks for the info, but I am not willing to force people to use Java, and I don't want to use it either.
    And since this really is a bug in the system, I'd rather wait for a real fix than work with tailored solutions.
    I don't see how it's a bug. It's not like there is official HoN mod documentation stating that the HoN Lua parser supports a specific encoding type.

    Normally I'd say there wasn't a chance in hell S2 would change something like this, but Bang seems to be the driving force behind Lua and one of the few S2 guys that still has the spirit to make any notable tech changes.

  6. #6
    Has anyone here talked to Bangerz about that? If all other files are encoded in UTF-8, I think LUA files should be too.

  7. #7
    Quote Originally Posted by SHiRKiT View Post
    Wait, you mean that the LUA parser doesn't accept UTF-8 without a BOM?

    Because I am writing UTF-8 files without the BOM header, and I read and open them assuming they are UTF-8. This could lead to some serious problems: any non-ASCII character can be interpreted wrongly. And if I must detect which charset is used for each file, I'll need an additional library to do that, and so will Notausgang. Also, are the LUA files encoded in ANSI on Linux/Mac? I really doubt it, as ANSI here means Windows-1252, which is mainly used on NT systems.

    Can anyone confirm if the Linux AND Mac files are encoded as Windows-1252 also?
    Right now I can only assume that it's Windows-1252, since there is no BOM and I have not seen any LUA file with special characters that would let you guess (not that I have searched). And if it's UTF-8 it should have the BOM.

    And it's worse ... if the parser read UTF-8 without a BOM it would be OK, but it misinterprets the BOM ... that's the big problem.
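
    To illustrate why the BOM is the troublemaker, here is a small Python sketch. It assumes the parser reads the bytes as Windows-1252, which is only a guess based on this thread, and `read_as_ansi` is a stand-in name, not anything from HoN.

```python
import codecs

script = "x = 1\n"

# The same script saved two ways: "utf-8-sig" writes the BOM, "utf-8" does not.
with_bom = script.encode("utf-8-sig")
without_bom = script.encode("utf-8")


def read_as_ansi(raw: bytes) -> str:
    """Stand-in for a reader that treats every byte as Windows-1252."""
    return raw.decode("cp1252")


# With a BOM, three stray characters end up in front of the first statement.
print(repr(read_as_ansi(with_bom)))
# Without a BOM, an ASCII-only script comes through unchanged.
print(repr(read_as_ansi(without_bom)))
```

    Under that assumption, a mod manager that writes plain "utf-8" without the BOM produces files that an ANSI reader handles correctly, as long as the script sticks to ASCII.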
