Author Topic: FreeBASIC CWstr  (Read 13790 times)

0 Members and 2 Guests are viewing this topic.

Offline Jeff Marshall

  • Newbie
  • *
  • Posts: 31
Re: FreeBASIC CWstr
« Reply #180 on: January 20, 2019, 05:26:07 PM »
Yes, it looks like it only has to do with line endings.

One part of the problem is how fbc project is configured compared to how your system is configured.  And maybe, you are using a .gitattributes file locally to try and manage this CRLF/LF problem.  Not sure.  Regardless, it's the line endings that are causing the grief.

The other part, is that last summer, I applied a patch from SARG and still got mixed line endings stored to the repo.  Some of the changed files you are seeing are all the files that got merged from pull request 96.  https://github.com/freebasic/fbc/pull/96 (like emit.bi edbg_stab.bas).  Where we are just now seeing the LF -> CRLF conversion.

So,

We do want every file we get from freebasic/fbc to have CRLF line endings on windows, and LF on linux.

From fbc wiki https://www.freebasic.net/wiki/wikka.php?wakka=DevGettingTheSourceCode
: The recommended setting for core.autocrlf is true, so that the FB source code in the working tree will have CRLF line endings.

Locally, for a single repo, looks like setting autocrlf can be stored in ./git/config file.  (git config --local).  Need to make sure that works like I think it should.

From this site: https://ywjheart.wordpress.com/2017/03/22/autocrlf/
: use core.autocrlf = true if you plan to use this project under Unix as well

Also, if the setting autocrlf has been changed, or in .gitattributes, then the changes may not get seen right away
Info in https://help.github.com/articles/dealing-with-line-endings/

heh, I'm trying.  Also, the development tree is shared by many people, so I keep most scripts, batch files, IDE files, tools, etc, outside of the fbc directory.

Offline Juergen Kuehlwein

  • Full Member
  • ***
  • Posts: 118
Re: FreeBASIC CWstr
« Reply #181 on: January 20, 2019, 07:35:46 PM »
Well, you can set autocrlf systemwide, global and local. All of these settings end up in a config file in the corresponding location. As i understand it setting autocrlf to true doesnīt affect .bas/.bi files, itīs for .txt files (at least for me it doesnīt convert -bas/.bi files at all - i tried it). Setting it to false removes all .txt files from the list of files GIT reports to have been modified. So i donīt have to deal with them anymore.

My IDE cannot work with LF line endings (which btw isnīt Windows standard), so i must convert them to CRLF. I tried .gitattribute, but you told me not to use it. Now i wrote a little utility doing that for me (all files in src\compile and all new files i added). Copying these files (with CRLF line endings) to Linux (Ubuntu64 on a virtual machine), didnīt cause any problems there, neither did the test files.

Which files must have LF line endings ? Most files in src\compiler already have CRLF line endings anyway, itīs only a few, which i must convert. If i had a list of them, i could write an (expandable) utility for converting them from CRLF to LF (and vice versa) as needed. A simple .txt file for listing directories or single files could supply input and the utility would convert these files for Windows or Linux. What do you think ?


JK

Offline Jeff Marshall

  • Newbie
  • *
  • Posts: 31
Re: FreeBASIC CWstr
« Reply #182 on: January 20, 2019, 09:18:42 PM »
core.autocrlf affects all TEXT files.  Which could be ANY filename.ext, that does not have binary content.  With correct autocrlf setting, and no extra .gitignore or .gitattributes, I don't think you should have to write scripts to deal with line endings.

Easiest way to see the difference is do a new clone & checkout, forcing the autocrlf setting for comparison.

Clone & checkout into fb.0 with whatever line ending was stored in the repo (should be all LF):
git -c core.autocrlf=false  clone https://github.com/freebasic/fbc fb.0

Clone & checkout into fb.1, converting all TEXT files to CRLF line endings:
git -c core.autocrlf=true clone https://github.com/freebasic/fbc fb.1

autocrlf setting is also important when making commits back to fbc repo.
When committing to from windows, autocrlf=true (CRLF's get converted back to LF's)
When committing to from linux, autocrlf=false (line endings get stored as-is which should already have been LF)

The only files that must be LF are
./manual/cache/*.wakka (raw data files from wiki)
./tests/file/big-list-eol-lf.txt (a specific test that needs LF on all targets)

Offline Juergen Kuehlwein

  • Full Member
  • ***
  • Posts: 118
Re: FreeBASIC CWstr
« Reply #183 on: January 20, 2019, 09:58:20 PM »
Quote
Easiest way to see the difference is do a new clone & checkout, forcing the autocrlf setting for comparison.
thatīs exactly what i did! Regardless of the autocrlf setting, i always have some files with LF line endings after cloning in src\compiler - i spend days trying to get GIT do what i want (CRLF as line endings right after cloning), it just didnīt work. Finally i gave up on it and it took me ten minutes to write own code fixing the issue.

Quote
The only files that must be LF are
./manual/cache/*.wakka (raw data files from wiki)
./tests/file/big-list-eol-lf.txt (a specific test that needs LF on all targets)
I didnīt touch any of these files, nor were any of them part of a commit, so there should be no problem.

What about just testing and evaluating my changes and additions, while i on my part work on a solution to preserve line endings just as they are in the master branch? Or why not have line endings for the source code consistent - either all LF or all CRLF, instead of having some with LF and having others with CRLF?

Or add a file e.g "rules.txt" specifying rules like naming conventions, required line endings, exceptions from the rule and the like for the root folders of the repo. This would make things clear once and for good.


JK

Offline Juergen Kuehlwein

  • Full Member
  • ***
  • Posts: 118
Re: FreeBASIC CWstr
« Reply #184 on: January 20, 2019, 11:11:50 PM »
Re-reading the info in https://help.github.com/articles/dealing-with-line-endings/, what about a .gitattributes file in the master branch? Properly configured this would solve the problems we have for good too, because .gitattributes did work for me - but you advised me not to use it. I understand now that my implementation was improper for general use, even if it worked for me.

But it should be possible to setup a generic .gitattributes file. Because of the fact that it overrides other GIT settings, it would make clones independent of individual user settings in this repect and ensure proper line endings everywhere.

Please, think about it.


JK

 
« Last Edit: January 20, 2019, 11:16:17 PM by Juergen Kuehlwein »

Offline Juergen Kuehlwein

  • Full Member
  • ***
  • Posts: 118
Re: FreeBASIC CWstr
« Reply #185 on: January 21, 2019, 10:44:20 PM »
Jeff,

i ran further tests showing that .gitattributes could really be the solution. As already mentioned above setting autocrlf to true doesnīt convert all .bas/.bi files properly to crlf - emit.bi and others keep failing!

Adding a .gitattributes at root level like this

Code: [Select]
# Auto detect text files and perform LF normalization
* text=auto
# BASIC source files (need cr/lf line endings)
*.bas text
*.bi text
*.inc text
*.h text
*.bat text
*.txt text
*.jkp text
*.rc text

*.gif binary
*.jpg binary
*.png binary

./manual/cache/*.wakka eol=lf
./tests/file/big-list-eol-lf.txt eol=lf

does the trick. Astonishingly enough all other problems i had with .txt and .sh files seem to be gone as well.

You may have to adapt and refine what i posted here, but in my opinion this the way to go in order to avoid the mess we currently have. It does what i already proposed above, it allows for setting rules for line endings per file type, it s independent of user settings because it overrides them. That is, it forces consistent line endings for the corresponding OS. On the contrary to autocrlf it works - and you can control it.


JK

Offline Jeff Marshall

  • Newbie
  • *
  • Posts: 31
Re: FreeBASIC CWstr
« Reply #186 on: January 24, 2019, 11:53:46 PM »
Hey JK, yes, I've seen all your updates.  It's only been a few days, dude!  Please allow me some time to review and respond.  I really want to get the 1.06 release out first.  Then we can be more liberal about what gets merged in to fbc/master.  Just after a release is a good time to add new features.

Offline Juergen Kuehlwein

  • Full Member
  • ***
  • Posts: 118
Re: FreeBASIC CWstr
« Reply #187 on: January 25, 2019, 03:17:16 PM »
Quote
I really want to get the 1.06 release out first

Yes, do that! (and then letīs go on ... :-))


JK

Offline Juergen Kuehlwein

  • Full Member
  • ***
  • Posts: 118
Re: FreeBASIC CWstr
« Reply #188 on: April 07, 2019, 11:41:32 AM »
Hi José,


after a little break here is an update of the current situation:

After the 1.06 FreeBASIC release Jeff basically agreed to merge in my pull request. There might be changes to some additions (extra string and array handling functions) but he definitely supports the integration of USTRING into the compilerīs code.

That is, we can finally get rid of the need to prepend "**" for variables and expressions at all. All of these changes work with my implemetation (CWSTR stripped down to the bare minimum) as well as with your original CWSTR. Existing code doesnīt have to be changed, you can still use "**", but you donīt need to anymore.


My code differs in two minor points from yours (CWSTR):
- i had to move "m_pBuffer" in first place of the member variables, this makes an improved "STRPTR" possible

Code: [Select]
TYPE DWSTR
  m_pBuffer AS UBYTE PTR                              'pointer to the buffer (in first place -> make strptr possible)

  Private:
    m_Capacity AS Ulong                               'the total size of the buffer
    m_GrowSize AS LONG = 260 * SizeOf(WSTRING)        'how much to grow the buffer by when required

  Public:
    m_BufferLen AS ulong                              'length in bytes of the current string in the buffer

STRPTR now returns a WSTRING PTR for Wstrings and Ustrings, which makes much more sense than a previously returned ZSTRING PTR. This change doesnīt break existing code, because you couldnīt use a ZSTRING PTR for processing a Wstring, you had to cast it to a WSTRING PTR or ANY PTR anyway.


- i had to remove the "CONST" qualifier in one place, this was necessary for some string handling functions to compile

Code: [Select]
    DECLARE OPERATOR CAST () BYREF AS WSTRING
    DECLARE OPERATOR CAST () AS ANY PTR


With these two small changes (which shouldnīt break anything) your CWSTR will be fully compatible to my compiler changes. Basically my USTRING implementation is written in a way, that (if pesent) WINFBX is the preferred way for adding dynamic wide strings. My code is only a fallback, if WINFBX is not used or cannot be used (e.g. LINUX).

Do you see any problems applying these changes to future releases of WINFBX?


JK

Offline José Roca

  • Administrator
  • Hero Member
  • *****
  • Posts: 2517
    • José Roca Software
Re: FreeBASIC CWstr
« Reply #189 on: April 08, 2019, 03:08:05 AM »
No problem at all.

Offline Jeff Marshall

  • Newbie
  • *
  • Posts: 31
Re: FreeBASIC CWstr
« Reply #190 on: April 28, 2019, 04:29:17 PM »
Hi Juergen, I have been reviewing your code last few weekends.

Your original code rebased on to current master I pushed here: https://github.com/jayrm/fbc/tree/jklwn-ustring
Overall, it's a big change to the compiler, though looks like you have made changes in the right places.  However, the logic in some places ignores cases that might be present.

I've been working on adding the changes step-by-step here: https://github.com/jayrm/fbc/tree/udt-wstring

I'm using this new syntax to indicate that a UDT should behave like a wstring (or a zstring):
Code: [Select]
type T extends wstring '' or zstring
  '' ...
end type

For testing, I am using a minimal implementation.  As far as the compiler changes go, I'm not really interested in how well the UDT works as a dynamic wstring, just that it works as wstring:
Code: [Select]
type UWSTRING_FIXED extends wstring
private:
_data as wstring * 256
public:
declare constructor()
declare constructor( byval rhs as const wstring const ptr )
declare constructor( byval rhs as const zstring const ptr )
declare operator cast() byref as wstring
declare const function length() as integer
end type

constructor UWSTRING_FIXED()
_data = ""
end constructor

constructor UWSTRING_FIXED( byval s as const wstring const ptr )
_data = *s
end constructor

constructor UWSTRING_FIXED( byval s as const zstring const ptr )
_data = wstr( *s )
end constructor

operator UWSTRING_FIXED.Cast() byref as wstring
operator = *cast(wstring ptr, @_data)
end operator

const function UWSTRING_FIXED.length() as integer
function = len( _data )
end function

operator Len( byref s as const UWSTRING_FIXED ) as integer
return s.Length()
end operator

I'm going step-by-step to determine what changes are needed, why they are needed, and fully test.  For example, test-suite for my branch not passing on Travis-CI due memory leaks I didn't notice (double free).  That could be due my changes, or due missing overloads the minimal implementation.

Also I have been investigating some of the long standing WSTRING bugs posted on sf.net while working on review of your code.

A requirement that the m_pBuffer element be first element should not be required.  Instead, fbc should be calling an operator overload to get the data's address if the user has defined one.  Or maybe it needs to be a requirement of the UDT that the user writes.

Offline Jeff Marshall

  • Newbie
  • *
  • Posts: 31
Re: FreeBASIC CWstr
« Reply #191 on: April 28, 2019, 07:56:35 PM »
Yeah, we are basically assuming that the existing wstring implementation is good, but I keep running in to annoying bugs.  Like this latest one, I found:
Code: [Select]
dim w as wstring * 50 = " "
dim r as wstring * 50 = trim(w)

Offline Juergen Kuehlwein

  • Full Member
  • ***
  • Posts: 118
Re: FreeBASIC CWstr
« Reply #192 on: April 28, 2019, 11:11:43 PM »
Jeff,


"EXTENDS WSTRING" in place of "#PRAGMA ..." is fine! But why not use the type in "ustring.bi" for testing? It is thoroughly tested, contains all necessary operators and overloads and thus helps avoiding type related errors.

Quote
A requirement that the m_pBuffer element be first element should not be required.  Instead, fbc should be calling an operator overload to get the data's address if the user has defined one.  Or maybe it needs to be a requirement of the UDT that the user writes.

m_pBuffer being the first element makes it possible to change the compilerīs code for "STRPTR" to work with WSTRING and USTRING. For both WSTRING and USTRING it returns a WSTRING PTR to the wide string data, this is different from what it did before (return a ZSTRING PTR for a WSTRING), which is - at least - undesirable.


JK 
« Last Edit: April 29, 2019, 12:04:12 AM by Juergen Kuehlwein »

Offline Juergen Kuehlwein

  • Full Member
  • ***
  • Posts: 118
Re: FreeBASIC CWstr
« Reply #193 on: April 29, 2019, 12:02:41 AM »
The problem with:

Code: [Select]
dim w as wstring * 50 = " "
dim r as wstring * 50 = trim(w)

is a "ssize_t" overflow problem. Adding two lines to "...rtlib\strw_trim.c" solves the problem:

Code: [Select]
        ...

chars = fb_wstr_Len( src );
if( chars <= 0 )
return NULL;

if( wcscmp(src, L" ") == 0)        //added
return NULL;               //added

        ...


JK

Offline Jeff Marshall

  • Newbie
  • *
  • Posts: 31
Re: FreeBASIC CWstr
« Reply #194 on: April 29, 2019, 12:39:32 AM »
I am still using your "ustring.bi" and "tests/ustring/*.bas" at each step to check that the changes to fbc still work with what you expect in "ustring.bi".  But "ustring.bi" is too complex to specifically test each change in the compiler, and most of the features in ustring.bi are not specifically needed to test the changes in fbc compiler.  So I am using a simplified UDT to test each compiler change.  In the end, "ustring.bi" will work too.

FYI, hides the problem, doesn't solve it.
Code: [Select]
if( wcscmp(src, L" ") == 0)        //added
return NULL;               //added
« Last Edit: April 29, 2019, 12:44:11 AM by Jeff Marshall »