Comments on The Oracle at Delphi: Meanwhile, back at the (Unicode) ranch
Feed updated 2024-03-10T12:04:17.661-07:00

Jason, 2008-04-21T10:16:31.000-07:00

Hi, thanks for updating this blog.

I studied your blogs a little and came to the conclusion that the whole conversion to Unicode has a major impact on the software I have written. Though I totally agree with the shift, is there any test version, alpha/beta, or trial available to start rewriting, or is it all too early?

Will it help to start rewriting it all to wide strings and later change it all back to normal operations? I am using many functions that loop through characters based on increments of single bytes.

Will there be any fast general find/replace function and sorting routine? This would help with a lot of the rewriting difficulties. Any reaction appreciated.

Regards Jason

Jolyon Smith (http://www.deltics.co.nz), 2008-01-30T05:23:46.000-08:00

C Johnson > I hope you meant LongInt. DWORD is LongWord, i.e. unsigned.

Q: 64-bit signed integer will be .... LongerInt? ;) LOL

But it won't be a "problem" if Integer becomes 64-bit, unless you have code that directly relies on the max/min of a 32-bit Integer. At least, not a problem in the same way as with ANSI/Unicode, i.e. inherent (and silent) data loss.

Jolyon Smith (http://www.deltics.co.nz), 2008-01-31T04:59:52.000-08:00

Sorry, it wasn't at all clear that you were referring specifically to string lengths. You appeared to be talking about the use of "Integer" type variables, not a specific instance of a current "Integer" value in the RTL.

But anyway, how useful, really, is a 2GB string? (2 Giga WideChars = 1 Giga Unicode characters).

I don't think it would be unreasonable to leave the length component of String RTTI as a 32-bit signed Int, if that helps with compatibility (although I agree the use of a signed value for something that can never be negative is something of an anomaly).

Allen Bauer (http://blogs.codegear.com/abauer/), 2008-02-01T01:42:16.000-08:00

A K,

Yes, with some qualifications. Any RTL function you call that takes "var" string parameters may not work if the param is a Unicode string and you try to pass an AnsiString. Also, event handlers should not be changed. Other than that, most things should continue to work.

Allen.

David Howes, 2008-01-28T19:43:01.000-08:00

When outputting files, is it possible to control whether a BOM is written or not? Some applications won't recognise or expect a BOM and will not work correctly if they encounter one.

Concerned, 2008-01-28T18:05:07.000-08:00

What if I want the strings stored internally in a TStringList to be ASCII/ANSI strings??? (i.e. I have a long list of keywords or an English dictionary word list which has no need for Unicode) - storing it in the new TStringList will immediately double my memory consumption for exactly the same data... how do we handle this?

Or will there now be a TAnsiStringList class that we should use instead???

Sebastian Z, 2008-01-28T16:49:48.000-08:00

I'd prefer to have UTF8 as the default encoding when saving a StringList. If the default is ANSI and somebody forgets the additional parameter, this would easily result in a "unicode loss" bug.

Fatih Tolga Ata (http://www.diyezon.com), 2008-01-28T23:25:36.000-08:00

Thanks Allen,
Unicode files without a BOM are a very important issue for some topics. For instance, a BOM is a problem in PHP files. So both BOM reading and writing should be an optional property or parameter.

Allen Bauer (http://blogs.codegear.com/abauer/), 2008-01-29T01:24:33.000-08:00

Concerned,

"Or will there now be a TAnsiStringList class that we should use instead???"

We'll certainly consider this.

Allen.

Allen Bauer (http://blogs.codegear.com/abauer/), 2008-01-29T01:26:35.000-08:00

"When outputting files is it possible to control if a BOM is written or not?"
"The unicode files without BOM is very important issue for some topics."

I will have to check whether or not we'll allow writing the files without the BOM. I'd imagine there would be a way to do it, and if not right now, I'll add it as a suggestion.

Allen.

Allen Bauer (http://blogs.codegear.com/abauer/), 2008-01-29T01:29:57.000-08:00

Sebastian,

The presumption here is that just because the strings are Unicode, your application doesn't magically start injecting Unicode characters into them. If your application was reading/writing ANSI data, the conversion is lossless since the code page remains constant. The only time you may encounter loss is if there is a mismatch between the reader's and writer's code page.

Allen.

Bruce McGee, 2008-01-29T01:43:02.000-08:00

Will methods like LoadFromStream and SaveToStream be overloaded as well?

Allen Bauer (http://blogs.codegear.com/abauer/), 2008-01-29T02:15:51.000-08:00

Bruce,

"Will methods like LoadFromStream and SaveToStream will be overloaded as well?"

Yes. I should have been more clear. I intended to indicate that by the reference to that group of functions as "TStrings.ReadFrom/WriteToXXXX."

Allen.

Allen Bauer (http://blogs.codegear.com/abauer/), 2008-01-29T03:43:58.000-08:00

Clinton,

Yes. That function will move just fine. No changes are needed at all.

Allen.

Bruce McGee, 2008-01-29T04:02:10.000-08:00

Will the new Unicode work under Windows NT, or will it be strictly for Win2k and above? I don't mind that Win9x (Windows Playstation?) won't be supported.

Allen Bauer (http://blogs.codegear.com/abauer/), 2008-01-29T04:10:42.000-08:00

Bruce,

We're still evaluating whether or not we'll certify targeting of NT. I would imagine that it should work since NT was Unicode from the start. However, there may be some APIs and functionality that do not exist on earlier NT versions which may make some things incompatible. Anything before NT4 SP4+, I highly doubt would work well.

Allen.

Bruce McGee, 2008-01-29T04:25:28.000-08:00

Thanks. If NT4 is supported, then I think it's reasonable to expect users to have at least SP4(a) installed. The only ones I have to worry about are fully patched.

Jolyon Smith (http://www.deltics.co.nz), 2008-01-29T06:24:01.000-08:00

"We'll certainly consider this." (TAnsiStringList)

This does not inspire confidence. Unicode is a "must have" for the future, but it's an utter irrelevance for most existing Delphi applications.

TStringList should be ANSI.

TUnicodeList should introduce a new list-class for supporting Unicode strings (TUnicodeStringList - would contain unnecessary redundancy in the name IMHO - what other Unicode things might be in a list? Unicode Integers?).

By definition ONLY NEW Delphi applications will make use of Delphi Unicode support - forcing existing applications to jump through hoops simply to work as they did before is a recipe for losing upgrade sales.

I'd rather wait a little longer for a successful Unicode delivery than get an early one which does not provide a practicable transition for existing, strictly ANSI, applications (which again, by definition, is surely pretty much ALL existing Delphi applications).

Unless there is a radical rethink this could be a fatal misstep for Delphi/CodeGear. I fear it may be too late for the rethink that is required though.

:(

Qian Xu, 2008-01-29T10:40:46.000-08:00

"We'll certainly consider this." (TAnsiStringList)

No. What must be added is
1. TStringList.SaveToFileEx(FileName: string, BS: TBOMStrategy)
2. TStringList.DefaultBOMStrategy: TBOMStrategy.

Allen, you mentioned in a previous post that a switch between UnicodeString and AnsiString means double effort. But you see, sometimes double effort cannot be avoided. ;-)

Pavel S, 2008-01-29T12:06:39.000-08:00

Forcing existing Delphi applications to Unicode would be a VERY_BAD_THING. As Jolyon said, D2007 could be the last Delphi bought by a large percentage of your users.

Kryvich, 2008-01-29T14:45:01.000-08:00

It'd be desirable to have ANSI-encoded string fields in a database:

TFieldType = (ftUnknown, ftString, ftAnsiString{!!!}, ...

In some cases it'd be a big memory saving.

Bruce McGee, 2008-01-29T18:54:58.000-08:00

I will need some way to override the BOM, especially for XML.

Otherwise, I'm really happy with how CodeGear is going about this, and I don't see any big migration problems. And trust me, I'm concerned about forward and backward compatibility.

I suspect that my biggest issue will be replacing PChar with PByte in a couple of places and adding some IFDEFS for previous versions of the compiler.

Giel, 2008-01-28T06:28:50.000-08:00

Thanks for this great information Allen. I still do have two questions:

1. Can TMemIniFile.ReadString / WriteString handle characters up to #255 which represent binary data instead of text? Maybe this can be accomplished with a custom TEncoding class?

2. Do you have any thoughts on the Text type (WriteLn etc.)?

Bear Xu, 2008-01-28T13:12:15.000-08:00

Really good.

In our application, we have our own UnicodeFile unit, and INI files very much need to be Unicode for the string parameters that users define!!

Hope we can test this early.

David Howes, 2008-01-28T13:13:24.000-08:00

What about files that are clearly Unicode but have no BOM? Many Unicode libs have functions that auto-detect Unicode. Will Delphi?
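
Two BOM questions come up repeatedly in this thread: detecting Unicode in a file that has no BOM, and choosing whether a BOM is written on save. The following is a minimal sketch of both ideas, written in Python rather than Delphi, and it does not represent the Delphi RTL or any announced API. The names `sniff_encoding`, `decode_text`, and `encode_utf8` are hypothetical, and the `cp1252` fallback is an assumption standing in for "the reader's ANSI code page". The detection heuristic is deliberately simple: look for a BOM first, then try a strict UTF-8 decode, then fall back to the ANSI code page.

```python
import codecs

# BOM signatures, with UTF-32 LE checked before UTF-16 LE because the
# UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(data: bytes, fallback: str = "cp1252"):
    """Return (encoding_name, bom_length) for a byte buffer.

    Hypothetical helper: a BOM wins if present; otherwise the data is
    treated as UTF-8 when it decodes strictly, else as the fallback
    ANSI code page (an assumption, not something detectable from bytes).
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    try:
        data.decode("utf-8")  # strict decode; raises on invalid sequences
        return "utf-8", 0
    except UnicodeDecodeError:
        return fallback, 0


def decode_text(data: bytes, fallback: str = "cp1252") -> str:
    """Decode a buffer, skipping the BOM when one was found."""
    encoding, skip = sniff_encoding(data, fallback)
    return data[skip:].decode(encoding)


def encode_utf8(text: str, write_bom: bool = False) -> bytes:
    """Encode as UTF-8; the BOM is opt-in so PHP or XML consumers are safe."""
    body = text.encode("utf-8")
    return codecs.BOM_UTF8 + body if write_bom else body
```

Making the BOM an explicit parameter on the write side mirrors what several commenters ask for. On the read side, a heuristic like this can misclassify short or pure-ASCII input, so an explicit encoding parameter is still worth having alongside any auto-detection.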