Welcome to Kodlogs Q&A, programming questions and answer website.

+2 votes
32 views
I am getting the following error message when I compile:

An undefined reference to one of my functions in C++. The error is "undefined reference to ...".

int test();

int main() {

    test();

}

a.cpp:(.text+0xc): undefined reference to 'test()'

error: ld returned 1 exit status
by (1.5k points)  
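The message in the question comes from the linker (ld), not the compiler: `int test();` only declares the function, and no definition of `test()` is ever given to the linker. A minimal sketch of the fix, assuming the intent was for `test()` to return an int; the function body and the helper `call_test` (a stand-in for the call made in `main`) are hypothetical:

```cpp
// Declaration: tells the compiler that test() exists somewhere.
int test();

// Definition: this is what the linker was missing. It may live in this
// file or in another .cpp file, as long as that file is also compiled
// and linked into the same program. The body here is hypothetical.
int test() {
    return 0;
}

// Stand-in for the call that main() makes in the question.
int call_test() {
    return test();
}
```

If the definition lives in a second source file, both files have to be passed to the build, for example `g++ a.cpp b.cpp`. For a class method, the definition must also be qualified with the class name (`int MyClass::test() { ... }`, where `MyClass` is a placeholder), otherwise the linker sees an unrelated free function.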

6 Answers

0 votes
I threw up a bunch of lines from the C++ standard, and basically the key part here is: undefined behavior is "behavior for which this International Standard imposes no requirements". No requirements: that's the key. There are no limitations on what could happen.

Examples of what could happen. Okay, your program could crash. My feeling is that this is one of the best things that could happen. Your program could give unexpected results. I'm not going to say wrong answers, because there are no wrong answers in undefined behavior; any particular result is every bit as correct as any other result. So: unexpected results, not wrong answers. Your computer could catch fire. Your cat could get pregnant. This would be really disconcerting for me because I don't have a cat, so if my cat gets pregnant that's big news. The C committee has this ongoing joke, which started off as "undefined behavior could cause demons to fly out of my nose", and now they've formalized it to "invoking the nasal demons".

Frankly, the scariest result of undefined behavior is this one: your program could appear to work fine. It gives the answers you expect and it does what you expect on this run and on the next run. You run it on Monday, and on Tuesday it does something different. Or you upgrade to a new version of the compiler, or you decide to optimize for size instead of performance, and suddenly it gives you different answers. Anyway, the key thing to remember about undefined behavior is: no wrong answers. There is no incorrect behavior.

We'll start off with a simple example. Okay, we have an array, we have an int, and we have an expression here. We're modifying the variable i twice within the same expression. We can talk about sequence points and things like that, but this is undefined behavior:
modifying a variable twice in a single expression. What should get printed here? Well, the answer is that there are no wrong answers here. Printing nothing is perfectly reasonable. So is reformatting your hard disk. More prosaically, GCC will print 10 and Clang will print 9, under some circumstances. Okay, I didn't try varying the optimization level or trying different versions of these and so on, but you can see that neither of these is wrong. Somebody has actually tried this with ICC: ICC prints 11. The thing is that it all depends on the order of these operations. [Audience] And addition is commutative. Right, yes. Neither is wrong. Exactly: there are no wrong answers. Syntactically this is a well-formed program, but semantically this is a program that exhibits undefined behavior, and so you have no guarantees about the result.

I saw your hand first. [Audience] I wonder whether printing nothing is actually possible, because the undefined behavior is of course in the evaluation of the addition, but the expression still has to be some kind of... No. If you hang on to that, I think I have some compelling examples later that show why that's not a correct assumption. Undefined behavior means undefined behavior. Back here: "imposes no requirements". There's nothing that says this program actually has to terminate, because it exhibits undefined behavior. It doesn't have to terminate, it doesn't have to print anything, it doesn't have to do anything.

Yes? Sorry, the question was: could a compiler or static analysis tool find undefined behavior and flag it? In this example, yes, it certainly could. We can look at this, inspect it, and say that it really doesn't matter what i starts out as; this is going to
be undefined behavior pretty much every time you run it. I'll show you some examples in a little while where that's really not true.

There was another hand over here. Yes? The comment was that the undefined behavior does not just apply to this expression; it applies to the program as a whole. The program exhibits undefined behavior, and that's correct. Any other comments on this? Yes. [Audience] I have this huge program and I've got one line of undefined behavior in one DLL, buried deep in the code. Does that mean the program is undefined in its entirety? Okay, the question was: if you have a routine that has some undefined behavior in it, buried in some DLL that you call, does that mean your entire program is undefined? The answer is: if you call that routine as part of the execution flow of your program, then the behavior of your program is undefined.

Yes? [Audience] Could we say that until the moment you encounter an expression or an evaluation which produces undefined behavior, your program is in well-defined territory? Okay, the question is: can we say that until your program executes a piece that exhibits undefined behavior, the behavior of the program is well defined? Sadly, no. If you'll hang on to that, I'll show you why, maybe 10 slides in. It has to do with compiler optimizations.

Okay, so how can you get undefined behavior? A whole laundry list; I've actually got three slides. Signed integer overflow is probably the most common. Interestingly enough, for unsigned integers there's a special call-out in the standard that says unsigned integers basically have to act like two's complement
integers: they wrap modulo one more than the maximum unsigned value. But for signed integers this is not true. A lot of this is because it came from C, and C runs on systems where signed integer overflow could cause a trap. Okay, the standard is not going to say anything about trap instructions or anything like that. The standard doesn't say anything about memory protection either, so that is behavior outside the standard; that's behavior the standard places no restrictions on. I have several examples of signed integer overflow because it's probably the most common.

Dereferencing a null pointer, or the result of malloc(0). If you call malloc with a size of 0 you can get back a valid pointer. Okay, you can test to see if the allocation succeeded, you can pass it to free, you can pass it to realloc to make it bigger, but you can't indirect through it, because there's nothing there. Most modern machines have memory protection. If you try to dereference a null pointer, what happens? You get a bus error, a protection violation or something, and your program is now lying on the floor with a bullet in its head. That's behavior outside the scope of the standard. It's a very helpful behavior, but it's something the standard doesn't specify, and that counts as undefined behavior. That's a nice thing to have happen; remember I said that a program crash is a really good outcome. I used to write Mac software in the pre-Mac OS X days, and there was no memory protection, so people read from location 0. It was just a pointer then, and there was data there. It was almost always a bug, but you read from location zero and you got a value; it just wasn't what you expected.

Yes? [Audience] Is malloc(0) any different from, say, malloc(10) and then referencing beyond the end? It's really the same thing. The question was: is calling malloc(0) and dereferencing the result actually different from, say, allocating 10
and indexing the tenth element of that? No, not really. But it's an example of a pointer you just can't dereference at all.

Shift: shift left or shift right by an amount that's greater than or equal to the width of the operand. If you have a 32-bit integer and you shift left 32 bits, you don't get zero; you get undefined behavior. A lot of hardware will give you a zero; you'd just say all those bits get thrown off the end and get replaced by zeros.

One of the really hard things about undefined behavior, especially if you've been programming for a while, is that you have this mental model in your head of how the hardware works. Signed integer overflow: you add two big numbers and you get a number back, because that's how the add instruction on, say, x86 works. You have to let that go, okay, because C (sorry, C++) has this idea of an abstract machine, which it then maps to whatever you're running on. It's not a whole virtual machine like Java or anything, but you hear talk about the C++ memory model and so on, and that's the context in which the compiler generates code, which it then maps to the underlying hardware. For some of these things, like signed integer overflow, the result of adding two numbers that would cause an overflow varies from actual physical
by (8.9k points)  
0 votes
hardware to physical hardware, and so in the C++ abstract machine this is undefined behavior.

Reading from uninitialized variables: undefined behavior. Okay, if you're lucky you'll just get random values, but we'll talk about compiler optimizations. Modifying a variable more than once in an expression. Buffer overflow: reading or writing past the end of a buffer. Okay, that's different from reading uninitialized variables or uninitialized memory, because if you run off the end of a global buffer and you read or write there, that's probably not uninitialized; it's just a different variable.

Comparing pointers into two different data structures. You say, "What? I can compare pointers; it's just a compare instruction." Again, comparing pointers doesn't really make any sense if they point into different data structures. Okay, if you have an array, you can compare something in the array to the start of the array or to one past the end. There's actually a special call-out for one past the end, just like iterators: you can form an iterator one past the end; you can't dereference it, but you can manipulate it. And you can't compare iterators from two different data structures, right? Same thing with pointers. Pointer overflow: if you increment a pointer to the point where its underlying representation overflows.

Modifying a constant object in C++, or a string literal, because string literals are const char. [Audience] Sir, that won't compile. I use const because I don't want to have undefined behavior; I want it to warn me when I modify it. So Ed's comment was: if I have a constant object and I modify it, the compiler complains. And that's absolutely true, okay, if you do it directly. But suppose I have a const foo and somehow I end up with a pointer to a non-const foo that points at this thing. Now I can modify
it, and the compiler has no way to know.

Question in the back. So the question is: is it impossible, then, to check whether a pointer is within the bounds of an array? John said... well, you can certainly write code that checks, but I think John's right that if it's not in the bounds, you've invoked undefined behavior. Question in the back. The question is: could you use std::less? You can use std::less, but it won't tell you what he wants, which is whether the pointer is within a data structure; it just tells you which pointer is greater. You could compare that way. Hmm, I have to think about that.

Yes? [Audience] I got pointers back from several calls to malloc, so they're not in the same data structure. Am I allowed to order those by comparison, or is that undefined? What I would expect is that the ordering would vary from machine to machine and run to run. Okay, the question was: suppose I have a collection of pointers that I got from multiple calls to malloc; am I allowed to order those pointers? Obviously the order is going to vary from machine to machine, from run to run, and with what else is going on in your program at any given time, because the sequence you get back from malloc is non-deterministic. You know what, I'll go back and look into this some more. Okay, Michael has said that you can't do this with greater-than or less-than. John has said std::less works for this. I'm going to say that I'm going to go look at this more. Okay, Chi Hung says that this ties into the strict aliasing rules, and that if you have two pointers to unrelated, similarly aligned, compatibly sized types... okay, well, let's go
on with this, and if we have time we'll come back to it and we can discuss it.

Yes? [Audience] Is this a new thing in C++11? I remember a talk from Herb Sutter about whether the compiler is allowed to use const to optimize. The question is: is modifying a const object a new thing in C++11, because of a talk that Herb Sutter gave about whether or not the compiler could use const to optimize? I don't believe this is a new thing in C++11. My take from Herb's talk was that in C++11, because of the introduction of multithreading, the semantics of passing a value by const reference to a routine have changed. It used to say: the routine is not going to change this, and it's not going to change for the life of the function. Now it means only: the routine is not going to change this. That defeats a bunch of optimizations, because we could have another thread running at the same time that could be changing this object that I'm holding a const reference to. Okay, and so I can't assume in my routine that if I have a const reference to a vector and I call size() at the beginning and then call size() later, I get the same answer back, which I could do in C++03.

John, I saw your hand. [Audience] If you do a const_cast on something that was passed to you through a const pointer but was initially not a const object, you can go to town on that. But you can't cast away const and modify something that was originally declared as a const object, because the compiler might put that in ROM. That's why they say it's completely undefined behavior what happens if the object you're doing the const_cast on was originally declared as a const object. Yes. Was everybody able to hear that, or do I need to repeat it? Okay, yes:
a const_cast, as John said, does not necessarily cause undefined behavior, if the original object that you had, say, a const pointer or a const reference to was not const. If it was originally const, as John said, it could be in ROM, or in a section of memory that has been marked read-only by your memory protection system. Okay. Got it. So what he said was: even if your original object is const, you can still cast the constness away with a const_cast; that's defined behavior. But then if you modify it, that's undefined behavior.

Okay, more fun ones. Negating INT_MIN. Let's say, for 16-bit ints, INT_MIN is -32768, and you negate it. What do you get? You don't get 32768, do you? You get undefined behavior.

Data races. This is a huge catch-all in the standard, okay, and sadly there are not good tools for finding them. I had somebody come up to me yesterday after the sessions were over who, in one of the afternoon sessions that I was not in, had become convinced of something that I had said to him several months ago. He said, "You were right." The comment I had made is: there are no benign data races. There just aren't.

Mismatch between new and delete: when you delete something, you need to call the delete that matches the new you called.

Yes? [Audience] A lot of this stuff essentially occurs at compile time, but data races still seem separate, like something that may occur at run time. Right. The question is that a lot of these seem to be things that can be detected at compile time, but data races are obviously not so much detectable at compile time. And some of these, yeah, let me get down here, are also things that you can't necessarily detect at
compile time. There are some cases where undefined behavior can be detected at compile time, and those are actually the scary ones, because, as I'll show you when I get on to it, compilers are smart. But yeah, some of these you cannot detect except at run time.

[Audience] Related to that: imagine the program has been online for a year. Does it not matter what it did up till that point? It's a case where something occurs at run time. The question is: if it's something the compiler can detect at compile time, then every execution of the program exhibits undefined behavior; if it's something that can only be detected at run time, is this something where you get good behavior up to that point? I want you to hang on to that, because I have some slides coming that show you: well, not really.

Yes? [Audience] With the last one, memcpy with overlapping buffers, you have to be able to guarantee when you write the program that that cannot happen, because if you try to compare the pointers... Interesting comment: with memcpy with overlapping buffers, you have to guarantee this basically at compile time, because otherwise you're comparing pointers into different data structures. An interesting idea. The C standard library specifically has memmove, which is designed to handle overlapping buffers. Yes? Okay. So Shahan says this is again tied into the strict aliasing rules: one type can alias another, and bytes can alias pretty much everything, so that's how you get around this last one, or the one up above, memcpy.

Calling a standard library routine without fulfilling its prerequisites. If you read the C++ standard, there's always a bunch of things in there; you
know, it's a thousand pages of library calls, right? Many of them (I shouldn't say all of them) have a little thing that says "Requires:". Okay, those are preconditions. If you don't fulfill the preconditions, you have no guarantees on what that library call is going to do. I have a slide for that.

Okay, here's an example. You have foo, which does something; it doesn't really matter what it does. We call new foo and then later we delete it. So just think about this. We run four constructors; how many destructors get run here? On my system I get a really nice behavior: my program crashes with an error message that says it's attempting to free a block that wasn't malloc'ed. That's a really, really good answer. Yes? Ding ding ding. So the comment is: none of them, because p is an int pointer, not a foo pointer. Ding ding ding. Modulo bugs in my slides. I want to apologize for the slides. The code on the slides is there to present concepts. No, this is a typo, okay; this was not done deliberately. The programs tend to be shrunk down to fit on slides. I make these slides available; if you have trouble with the code on the slides, contact me and I will give you bigger programs.

Okay, atomic_is_lock_free. You pass it a pointer to a shared_ptr. This is a very weird interface, but okay. "Requires: p shall not be null." If you call atomic_is_lock_free and you pass null, there are no guarantees on what kind of behavior you're going to get.

Okay, arithmetic operations. Blah blah blah; this is, I think, the last wad of standardese in the presentation. Okay: if the result is not mathematically defined or not in the range of representable values for its type
by (8.9k points)  
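Two points from this answer can be shown without invoking undefined behavior: a const_cast followed by a write is fine when the underlying object was not declared const, and memmove (unlike memcpy) is specified to handle overlapping buffers. This is a minimal sketch rather than code from the talk's slides; the names `bump`, `bump_demo`, and `shift_right_by_one` are hypothetical:

```cpp
#include <cstddef>
#include <cstring>

// Adds one to *p through a pointer-to-const. This is only valid when the
// object p points at was not itself declared const; calling it on a truly
// const object would be undefined behavior.
void bump(const int* p) {
    int* writable = const_cast<int*>(p);  // the cast alone is always defined
    ++*writable;                          // defined only for non-const objects
}

int bump_demo() {
    int counter = 41;  // not declared const, so the write in bump() is fine
    bump(&counter);
    return counter;
}

// Moves the first len - 1 characters of buf one position to the right.
// Source and destination overlap, so memmove is required; memcpy on
// overlapping ranges is undefined behavior.
void shift_right_by_one(char* buf, std::size_t len) {
    if (len > 1) std::memmove(buf + 1, buf, len - 1);
}
```

Applied to the four-character buffer "abcd", `shift_right_by_one` leaves "aabc". The other rule mentioned above follows the same pattern of matching pairs: memory from `new T[n]` is released with `delete[]`, and memory from `new T` with `delete`.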
0 votes
the behavior is undefined. So if you're working with int8_t (excuse me, not uint8_t, which goes up to 255) and you add 100 plus 100, the result is undefined because you can't get to 200. [In C++ the int8_t operands are first promoted to int, so this particular sum is computed as 200; the rule bites when the arithmetic is done in the overflowing type itself, for example int.] Okay. Notes here: most existing implementations ignore integer overflows, but that's not actually required. Okay, let me back up: there is a specific call-out in this section, immediately after this, that says unsigned integers don't behave this way. They behave as if the arithmetic was done modulo UINT_MAX plus one, so that's not undefined behavior. Okay, but signed integers... Yes? [Audience] In LLVM, note that we don't just ignore integer overflow: if we can prove that it's signed integer overflow, we assume that can't happen. Yeah, I'll get to that. Okay, well, "most existing implementations ignore integer overflows" is not wrong; "most" is the weasel word here. LLVM, as Michael has pointed out, actually makes use of it, and I have slides about that, because that's really one of the scary things about undefined behavior: compilers are getting smarter about undefined behavior and they're using it as part of their code generation strategy.

Okay, no wrong answers. Does this print true? Does this print false? Does this print true followed by false? False followed by true? Does it print nothing at all? Does it invoke the nasal demons? Actually, yes, it invokes the nasal demons. The answer is that you have no idea. It can do a different thing every time you run the program. If you have a model of the low-level machine in your head, you will say, "Sometimes I'll read this and get true, and sometimes I'll get false, just whatever happens to be in memory." But that's not a good way of looking at it. Some compilers will look at this and just say, oh, the hell with it.
Clang, if you crank up the optimization level, will actually just say false. It won't allocate a variable, it won't do a test; it's just false. You can look at the code: it actually even looks at this and says, "This isn't even a format string, I can just call puts", and it generates a call to puts with "false" and a newline, and that's all the code it generates.

Yes, Mike? Why don't we just fix the standard and have it always initialize variables? Because one of the philosophies that C++ got from C was: you don't pay for what you don't use. Initializing variables that you might then go ahead and write right over is work that doesn't actually do anything for your program. It doesn't make your program better, and it slows it down. So, not speaking for the committee, speaking personally, I suspect that would be a non-starter.

Yes? The standard does, yeah; it says so for reading from uninitialized variables. Compilers are getting much smarter, and static analysis tools are really good at finding this. I have used several static analysis tools. The question is: why can't the compiler complain about this? Some compilers will, okay, and some static analysis tools will tell you this. The first time you run a static analysis tool on a 100-line routine and it says, way down here at the bottom, "You're reading from an uninitialized variable right here", you look at it and say, "What?" And it says, "You went down here, I took the true branch on this if, then I took the false branch on this if, then I took the true branch on this if, and I got to here, and nowhere in this flow of control was this variable set." And you say, "Dang, I like that." It's really nice when that happens. I would expect to see more of this in compilers in the future, because compilers have more resources; they're getting
smarter. Yes? The compiler certainly cannot always warn you. But sometimes, like this, the compiler could.

I saw a question down here. Yes? Okay, the comment is that 15 years ago the argument about efficiency was probably much more valid than it is today, because machines are bigger and faster, and this causes a fair amount of problems; it may cause more problems than the gains in efficiency are worth. But you have to remember that C++ and C run on a wide variety of machines. One of the most popular machines for just hacking around on these days is a Raspberry Pi, which is not a big machine. And I see people writing code for... where's Michael? Michael Caisse here was writing C++ code targeting a machine with something like 4K of memory a couple of years ago. It used to be that these were small devices; that's not true anymore, but there are always small devices coming in at the bottom.

John? [Audience] Where I worked, a really great guy did this optimization and squeezed out some more percent of performance: big win. But the code the company was working on was really scary. We didn't actually find any leaks, but there were a lot of naked pointers, so we reformatted it to use classes and cleaned it up, and suddenly his optimization just disappeared. We asked what happened, and when we looked it up, it was because when we introduced the classes we also initialized everything. Part of the optimization was that we had large amounts of index storage that wasn't used yet and hadn't been set at all, and now we were filling it in everywhere. Right, so you didn't pay for what you didn't use. Could everybody hear John? Once you have a policy that says the compiler always initializes, you can't do that. [Audience] Could you go ahead and repeat what John said, for the recording? The guy doing the recording could
hear it. I checked. Part of the comment is that people should crank those warnings up to the highest possible value and have warnings as errors. Everyone's got to make their own policies, right, but that's the way you get the compiler to detect this stuff. Because it's undefined behavior, the compiler has to let it through; it still compiles, and all it can do is generate warnings.

So the suggestion is to crank the warning level on your compiler way up, compile with warnings as errors, and hope that the compiler can catch some of these. In general that's a policy I favor. It's not always practical, especially when you're dealing with third-party libraries and so on that you don't have a lot of control over. I have a couple of slides at the end of the talk about other tools that you can use to detect undefined behavior as well. Ed, and then Michael.

Ed's comment: In my company all the developers use debug builds. We use Visual Studio, we build debug and get the thing working, and then we ship it as a release build. Just last week I had an issue with an uninitialized variable, and bumping up the level of optimization exposed it: all of a sudden the behavior flipped.

Yeah, that's the thing. So Ed said that he went from a debug build to a release build, upped the level of optimization, and the behavior changed because he had an uninitialized Boolean somewhere. Yes, that's exactly the kind of thing you can see with undefined behavior. The really insidious ones are when you get a new version of the compiler, you build with it, your program behaves differently, and you say, "Rassa-frassa stupid compiler vendors, can't even ship a compiler that works." Michael, did you have
your hand up back there, or were you just stretching?

Michael's comment (partly inaudible): Okay, my feeling is no default. Some people might mention performance, but also you don't know the correct choice. Here it might be simple and obvious that it should be false, but when you're taking some other data types you don't really know what it should be. The values are all potentially valid, and no matter what you pick, it will contradict somebody's choice. And finally, most compilers have a switch that lets me inject, at compile time, a bit pattern that I'm looking for, so that I can definitively know that something really is an uninitialized variable, as opposed to one that was accidentally initialized incorrectly.

Okay, so Michael's comment was that in this case it's probably reasonably easy to figure out what you can default-initialize this to, although you still have to choose true or false; but if the compiler does it, it has to have a default value for pretty much every type. He also mentioned that many compilers have the ability to initialize memory to some set pattern, like DEADBEEF or CDCD or something like that. That is easy to inspect: you say, "Oh, look at that pattern. I didn't write that pattern there. That's an uninitialized-data pattern." Yes?

Comment: Going beyond that, even if you knew what the default should be for every type, if you write code like this and the compiler gets rid of your undefined behavior by defaulting it, it is still bad code.

Yes. So if the compiler defaults this to something, basically it has gotten rid of the undefined behavior by choosing a behavior for you, which may or may not be what you want. Is that a better place to be? It is still bad code. Okay, about one more question and we'll move on, because
I've got another 20 slides and we're 45 minutes into the talk. Yes?

Question: Another thing of a similar nature. Why is it undefined rather than unspecified?

It's undefined because you're reading from an uninitialized variable. Okay, why does the standard say that those reads are undefined? It seems rational to expect that they would be merely unspecified. Hang on to that. Look, it has to do with compiler engineers.

Okay, so why do we do this? It gives the compiler leeway. It gives the compiler freedom to choose how to generate code, and by assuming that there's no undefined behavior in your program, the compiler can generate simpler, faster, smaller code. This is the same kind of thing that we inherited from C about array indexing. You say [3], and the compiler doesn't generate checks to make sure the array has three or more elements in it (more than three elements, excuse me). It just says, "Oh, third element."

Why is this important? Compilers know about language semantics, and they take advantage of it. Because the standard places no requirements on the behavior of programs that contain undefined behavior, it's perfectly legal for a compiler to transform a program that exhibits undefined behavior into any other program, because there are no wrong answers. And why is this important to compiler writers? How many people
by (8.9k points)  
0 votes
here work on code generation for compilers? Yeah: you, Gaby, Kathy, and Michael. Say you work on the back end of GCC and you come in and say, "Look, I sped up this set of benchmarks by 1%." Everybody says, "Really? That's wonderful, that's super. That was yesterday. What have you done for me today?" Code generator people live and die on performance and code size, and it's a never-ending slog: "What have you done for me?" "Yeah, that was yesterday. What have you done for me today?"

At my company we make chips and firmware that go in phones, and one of the things we have is a space budget in the ROM for the code that runs the broadband chips in your phone. It's X number of bytes; actually these days it's X number of megabytes, but that doesn't really matter. That's how much space there is in the ROM for this code. Somebody says, "I need this new feature." Okay, well, we're already at budget, so do you want to take something else out? No? Okay, we can refactor the code to make it more space-efficient, but the other thing they do is go to the compiler vendors, the compiler writers, and say, "Can you generate smaller code, please?" There's this constant pressure for smaller or faster, or usually both by preference, and this is one of the ways that compilers get that size and speed advantage.

Take GCC and LLVM: there are people out there every day running the same benchmarks on these two, looking at the differences, and choosing which compiler to use based on the results. "Oh look, on this test LLVM is seven percent faster than GCC. Wow, what are those GCC people doing? They're laggards." "On this test GCC is twelve percent faster than LLVM." This is really important to people who choose which compiler to use, and to people who write compilers, because, well, they want their compilers to get used.
You've got to remember that the compiler is an expert in the language semantics. Okay? You may think you know the language, but the people who write the compilers have embedded the knowledge of the language semantics into their code generation.

(Gaby's comment is largely inaudible in the recording.)

So Gaby's comment was that in the past the assumption was that the programmer knew what they were doing, that the programmer was competent and was expressing himself correctly in code. He also commented that there's a threshold for improvements: little tiny improvements tend not to get in, and people look for big improvements. Less than 2%, he said, is not worth doing. Although with the people I work with, a percent and a half doesn't seem that important, but when you put four of them together suddenly you're at something like seven percent, and all of a sudden everybody says, "Mmm." I saw another hand back here. No? Okay.

Anyway, John Regehr at the University of Utah has been doing a lot of work on undefined behavior, specifically about integer stuff but a lot about undefined behavior in general, and he has written a package called the Integer Overflow Checker. He has also come up with a taxonomy of undefined behavior for talking about routines.

Type 1: no matter what your inputs, no undefined behavior, ever. This is the kind of routine we should all strive to write.

Type 3: undefined behavior every time, no matter what the inputs. Here's an example. Okay, frankly, these are uninteresting. People don't write these very much, and if they do they tend to fix them pretty quickly, because they find that they're unreliable.

Type 2: undefined behavior for some subset of all possible inputs. These are the most common ones. Okay, these are the ones that people write by accident.
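The three categories can be sketched in a few lines of C++. This is an illustrative assumption, not the code shown on the slides: the names checked_div, unchecked_div, and always_undefined are hypothetical, and integer division is used only because its undefined cases (division by zero, and INT_MIN divided by -1) are easy to enumerate.

```cpp
#include <climits>

// Type 1: defined for every possible input.
// Refuses to divide (returns false) in the two cases where a / b is undefined.
bool checked_div(int a, int b, int& out) {
    if (b == 0 || (a == INT_MIN && b == -1)) {
        return false;
    }
    out = a / b;
    return true;
}

// Type 2: defined for most inputs, undefined for a subset
// (b == 0, or a == INT_MIN with b == -1).
int unchecked_div(int a, int b) {
    return a / b;
}

// Type 3: undefined for every input, because it always
// indirects through a null pointer. Shown only for contrast; never call it.
int always_undefined(int a) {
    int* p = nullptr;
    return *p + a;
}
```

A caller of checked_div has to look at the return value before using out; unchecked_div pushes the whole burden of avoiding the bad inputs onto every caller, which is what makes Type 2 routines easy to write by accident.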
Here's an example. Okay, this is taken from some code I reviewed at work several years ago; the names have been changed to protect the guilty. It takes in an int pointer and logs it, to say how we got called. We see if it's null; if it's null, then we malloc it, do some stuff, and return it. So this is a Type 2 routine. If the pointer that is passed in is not null, there's no undefined behavior here. But if the pointer that's passed in is null, this call to the log routine will indirect through it and thus invoke undefined behavior. This was very common throughout the code base I was looking at, because it was old code that ran on phones many years ago that didn't have memory protection. Indirecting through null didn't trap or anything; you just read from location 0 and got a value. It wasn't the value you expected, but you got a value.

So let's think about this from the point of view of a compiler. If p is not null, this branch never gets taken, does it? If p is null, I've got undefined behavior here, and there are no wrong answers. So do I really need to generate this code? No, I don't. GCC at -O2 and above will not generate this code; it will elide this entire block. It will not appear in your object file anywhere. Because of the call to log? Yes, exactly. Sun CC does this, Microsoft Visual C++ does this, Clang does this; lots of compilers do this. Okay, it showed up for the first time, I think, in GCC 4.2 or something. This is not a new optimization.

Okay, you're a compiler vendor, right? People say, "What have you done for me today?" You say, "Look: smaller code, faster code, and it's every bit as correct as it would be if I generated this code." I've taken a program that exhibits undefined behavior in some circumstances and changed the behavior of the program in those circumstances. In the cases where there was no
undefined behavior, the program behaves identically. It's only in the case where there was undefined behavior that the program behaves differently. That's a perfectly legal transformation. Any questions about this?

The comment is that it's especially fun when that log statement is generated by a macro. Yes, that's also fun. And the history of this code, of course, is that when it was originally written the log wasn't there, and there was no undefined behavior. Then some time later the mythical intern, who is the cause of everybody's problems in every code base, was given the task of adding logging to this code base, and they went through and did this. Actually, in the original code this was under #ifdef DEBUG or something like that. So in debug builds, undefined behavior; in non-debug builds, it's fine. But the thing is, the generated code was completely different, and that makes it really hard to test.

Comment: That would not be undefined as long as you don't indirect it in the case that it's null. Then it's not, in fact. The question was: what if you replace this with a ternary operator that says p == NULL ?
0 : *p? No problem. Is it still undefined behavior if only one branch does it? The question was whether the compiler is actually smart enough to detect that this is undefined behavior in the case where this is null, on one branch of the potential execution paths. The answer is yes, and it's not just this compiler; pretty much every compiler these days is smart enough to do that. Gaby? (Gaby's comment is partly inaudible; it ends by noting that you could just as well put in an assert that p is not equal to null.)

So what Gaby said was that the compiler is not actually generating a whole set of possible execution paths; it's reasoning about this code. It looks at this code and says: "I'm indirecting through p here, so what do I know? I know that either p is not null or I have undefined behavior. If I have undefined behavior, I don't really care; correctness is easy, I can do pretty much anything. So I'm just going to believe that p is not null from now on." It marks in its little knowledge base that p is not null from here on. Then it says, "Oh, this is a test for p being null. This always evaluates to false." So this gets changed to if (false), and then this is unreachable code and it can be deleted. Is that a good description? Okay.

Another example. This is a complete one; you can take this back and try it on your computer. This is all C, but that's okay; you can do C in C++ anyway. We start with a value, and each time through the loop we add this value to itself and print what the value is, and we do this as long as i is greater than zero. If you run this with GCC it will print out this number, then two-whatever, then four-whatever, then eight-whatever, and exit (except it won't print the hex values, it will print decimal values), and you say, "That's great." Then try it with -O2, and it will print two-whatever, four-whatever, eight-whatever, and then keep going forever. It will
print for as long as you're willing to let it run. So what has happened here? Well, the optimizer has looked at this code and done some reasoning about it. It says, "Hmm, that's a positive number, a positive signed number. Every time through the loop I'm adding a positive number to itself. That's going to be a positive number, or undefined behavior, and, well, I don't really care. Then at the end of the loop I'm testing to see if it's a positive number. But I just proved to my satisfaction that it's always a positive number, so I don't actually have to test this thing. I just go back and do it again." And yes, it does, because signed integer overflow is undefined behavior. On some machines, when this overflows they issue a hardware trap and your program gets killed, which is also a valid outcome of undefined behavior. Any questions about this? Okay.

Why do we care about undefined behavior? Because it's surprisingly easy to write code that has undefined behavior. This link here is to a discussion about a particular bug in the Google Portable Native Client runtime, where somebody did what they thought was a simple refactoring of some code and introduced some undefined behavior: a shift left by 32 of a 32-bit integer. This defeated a security check in the Portable Native Client runtime. Oops.

Code that exhibits undefined behavior may appear to work just fine for a while and then suddenly stop working. Break when? When you change the optimization level. When you go from a debug build to a release build. When you switch to a new version of the compiler: "God, those LLVM people, they're so stupid. I went from 3.3 to 3.4 and all my code broke. What have they done?" Or GCC 4.9, or Visual Studio 2010, 2012, or 2013. It's really easy to shoot the messenger here. Okay, you get a new compiler, nothing has changed in your code, you just built
with a new compiler and now it doesn't work. It's got to be the compiler, right? Not always. Okay, I know lots of people who work on compilers; there are people in this room who work on compilers. They'd be the last to tell you that compilers have no bugs; that's just silly. But an awful lot of the time it's not the compiler. This is what the STACK people call optimization-unstable code: code that behaves differently at different
optimization levels, with different compilers, and so on. And you've got to remember, there are no wrong answers here. The code is not behaving incorrectly. It's giving you different answers, and they may not be the answers you want, but that's because you're seeing the results of undefined behavior. This is the first time I've mentioned the STACK people; I'll mention them some more.

Undefined behavior shows up in tricky code, frequently code that's trying to do security checks and things like that. This is a fascinating discussion: it's something like a nine-year-old bug report in GCC, and it's just absolutely amazing. It was basically about an optimization that was introduced into a really old version of GCC, where it did something similar to this. It optimized out tests when it could prove that either the behavior was undefined or the result was always true. People got seriously bent out of shape. One of the comments was like, "I demand that you revert this immediately. People are going to die because you optimized away my security checks. I don't care what the standard says, you can't do this." And it goes on and on. It's a very entertaining half hour. I will make these slides available on the C++Now GitHub repo, so you're welcome to write this down, but if you don't, you can check it out later on.

Let's talk about STACK. STACK is a grad student project out at MIT. Their goal is to find what they call optimization-unstable code: code that behaves differently under different levels of optimization, usually because it contains undefined behavior. What they do is interesting. It's based on LLVM, and basically they compile code at low optimization and at high optimization and then examine the internal data structures of the LLVM back end to say, hey,
a whole chunk of code just disappeared here; why did that happen? Why did the optimizer decide that it could just get rid of this? And then they go looking for undefined behavior. They wrote a paper; I have a link to it at the end. It's a fascinating paper, both because of the technology behind it, how they did it, and because of the reactions of the people they talked to. They ran this against a bunch of code in Postgres, and the Postgres developers had a mix of reactions. Some were, "Ooh, wow, that's a nasty bug. Thank you for bringing it to our attention; we'll fix that." And some were, "No, that's the compiler being stupid." Okay, blaming the compiler is very, very satisfying, but you still have to ship the output of the compiler to users, and saying the compiler is being stupid doesn't really help your users.

Anyway, tests are hard to write. This is an example from the Apple secure coding guidelines from a couple of months ago; Bruce Dawson was all over this. This is actually the second revision of it. We multiply two numbers together (it doesn't actually say here, but I'm pretty sure they're int), and then we check whether n is greater than 0, m is greater than 0, and SIZE_MAX divided by n is greater than m, and then we can allocate this space. The problem is that it's too late. By the time of the test, the integer overflow, the undefined behavior, has already happened. You're multiplying two ints; the fact that you're assigning the result to a size_t is a red herring. It doesn't change the operation. If you cast these to size_t, then there's no undefined behavior here. The fact that you're assigning to a size_t is irrelevant, because you're multiplying two integers here. And the way I talk about
this is kind of like strcpy. We all know about the dangers of strcpy, right? You can't really use strcpy in a safe way; you pretty much have to pre-flight every single time to see if your buffer is big enough to hold the source. You cannot call strcpy and then check to see if a buffer overflow has occurred. It doesn't work that way; it's too late. This is a similar circumstance: you can't do something and then check to see if undefined behavior has happened, because the damage has already been done. In this case, though, the damage didn't happen at run time. The damage happened at compile time, in the compiler's internal data structures, the things that the compiler uses to reason about your code and to generate code. The compiler can assume that this last test here is true after the multiplication, because if it isn't true, undefined behavior has happened.

So this slide was my attempt to talk about aliasing, and somebody here (name unclear in the recording) is going to giggle, but that's okay. We have two unrelated types, where foo has pretty much the same layout as the start of bar. I see this a lot, actually; there's a bug in libc++ about this. We create a foo, we create a bar pointer that points to f, we write through that pointer, and then we try to print it out. The compiler can reason about this and say: you have a write to a structure of type bar and a read from a structure of type foo. They're not related classes; neither is a subclass of the other. The compiler can assume that this write doesn't affect f, and it can reorder these, okay, if that generates smaller or faster code, and you get not the answer you were expecting. Yeah, this could print three. I have a much better example in the libc++ code base, but it doesn't come close to fitting on a slide.

Comment from the audience: that C-style
cast there is evil. Yes, but you know what? reinterpret_cast doesn't really help you here. It'll fail? Yes, but if you do a dynamic_cast, a dynamic_cast helps you here. Okay, that is helping, but the point is that the compiler can reason, "I see a write to one of these and a read from one of these, and they're different, unrelated types," and so it can assume that they're independent.

Comment: Okay, taking different structure definitions and pretending that they overlay the same memory, I call that the poor man's union. Yes, it is. The comment is that in C, if the first member of bar is actually a foo, then it's legal, and I believe that's true even in C++, because you could make a foo pointer here. But in any case.

Don't be this guy. You think you're solid, you think you're good, and... I bet that's cold. Anyway, I saw that a couple of months ago and said, "Oh, that's funny," and then when I was writing this talk it was like, "Oh yeah, that's a good way to put it." This is how people think about their code sometimes, and undefined behavior has this way of carving everything out underneath, and suddenly you're lying there. He didn't even roll into the water, so he got off lucky. Yes? Oh, he was first.

Question (partly inaudible): Okay, about undefined behavior in your program: I'm thinking of processors. They do predictive look-ahead, do some calculation, and throw it all away later. In your code you might want to start some calculation with numbers that may overflow or may be fine, and then throw it away because you've determined that it's not needed.

So the question is whether there is a reason to put undefined behavior in your program, and they talked about performance reasons, in particular speculative execution. I think the answer is no, and the answer is no because the compiler
won't do what you expect it to do. The compiler will make assumptions that basically say this undefined behavior could never happen, and generate code based on those assumptions.

Comment (partly inaudible): You know, in isolation, where it's going to overflow, but you're going to throw it away because you start two calculations.

So the comment was that maybe you start two calculations, one of them might turn out to be garbage, and then you pick one of them. I would be very leery about that, because there are no wrong answers. You remember this example, right? The compiler just elided a whole pile of code because it said, "If it's undefined behavior, I don't need to actually do this." And you won't get the answers you expect. If you're lucky, you won't get the answers you expect. If you're unlucky, you will get the answers you expect, and you'll be good for a while. But in general you don't really want undefined behavior in your code. You were next, and then John. (The question is inaudible in the recording.)

Okay, so let me try to summarize this, and you can tell me where I got it wrong. The big problem with undefined behavior is that you
upgrade your compiler, or you switch to a different compiler, and suddenly things stop working because you had undefined behavior, and you want a way to find it more easily and eradicate it from your code base. Or maybe I'm misinterpreting what you said completely.

Comment (partly inaudible): Okay, it's a quality-of-implementation issue, of course, but I am wondering whether compilers could do this kind of detection.

Okay, so, wondering if there are ways to require, or even as a quality-of-implementation thing, to have compilers warn, or error, when they detect undefined behavior. Okay, but let's go back to this example. If you never pass null to this, there's no undefined behavior here, and there's nothing for the compiler to warn about. I mean, okay, the compiler could warn that this code may never get executed. But there isn't any undefined behavior here; it's potential. If you never call it with null, you can remove this part of the function, this if (p ...) block. But yes, there are some tools available and there are other tools coming, and I'll talk about those, that help you find undefined behavior. Okay, if you don't call it with null, there's no problem. Who is next? John was next.

John's comment (partly inaudible): Okay, so I think what we'd like to see is really the trap that I think this talk is about, which is to say: some programmers are smart, so they say, well, I know the standard doesn't actually guarantee this, but I know what my platform will do in this case, and I'm just going to do it anyway. Or whatever the reason is: I know the standard doesn't guarantee what I might expect, but I don't need that in this situation. I think the whole trap is
that compiler writers are now figuring out what's happening. You've just given the compiler writer the license to generate any code at all, and there is no way that you can anticipate it and say, well, this undefined behavior is actually benefiting me somewhere.

Actually, the only case where you can do that is if you fix your platform, which means a single compiler, forever, at a single optimization level. That may be a valid thing to do in the very short term, but how many people out there are still using GCC 4.1? 4.4? Yeah, I know. Right here, and then Ed. And how are we doing here? Okay, I've got about eight more slides and eight more minutes.

Comment (partly inaudible): On a CPU with vector units, or on a GPU, you have 8, 10, 16, however many lanes that all do the same operation. Sometimes you'll use only part of the result, and you don't want to initialize the rest of it. On GPUs this probably has a bigger impact, when you're really wide.

Okay, the comment was that that example may be more applicable to GPU programming, where you have many cores doing very similar stuff and you use only some of the results. But the problem is that for the pieces that evoke undefined behavior, you can't rely on those answers, right? So why are you even doing that work? You're going to throw that work away, because it's cheaper to do the work than to not do the work? Yeah, you're putting an awful lot of trust in your compiler not being smart enough to figure that out. The comment was: if you're doing five calculations but your GPU has sixteen cores, it's just easy to do sixteen of them and then take five. And my comment is that you're putting a lot of trust in your compiler not being smart enough to notice that there is undefined behavior and generate code based on that.

Beman's comment (partly inaudible): It's the ability to reason, and that's a disaster: the ability to
reason about a program. Dave Abrahams and others smarter than I am pointed out long ago that it's just a disaster when you can't make the right inferences.

Yeah. So Beman's comment is that undefined behavior destroys your ability to reason about your program, and that this is a really bad thing: if you can't make inferences about how your program works, you're in a world of hurt. All right, yes?

Question (partly inaudible): Is there a way to say, with a flag maybe, "please make integer overflow, for example, defined, or maybe unspecified," even if the result is slower?

So the question was whether there are compiler vendors that let you, with a flag or something, make integer overflow defined, or give it a particular behavior. Yes, there are. I can't remember what the flag is; in Clang there is one, and there's one in GCC. It's probably the same flag in both of them. Yes, -ftrapv, I think, and -fwrapv. Okay. And as I mentioned, some architectures just issue a trap instruction and kill your program dead. Chandler?

So Chandler's comment was that if you use these flags, be aware that you've put the compiler in a much less well-tested state. This is not something that gets run all the time. (The next comment is inaudible in the recording.) Okay, I've got a few more slides to run through, and then we'll just do questions until... we have about two minutes.

What can you do about this? Be aware of undefined behavior. If you're doing something tricky, think about undefined behavior: "Am I invoking the nasal demons here?" If you can, build your code with several compilers and at different optimization levels. You can't check for undefined behavior after it's happened; okay, I talked about this earlier. If you write this code, "will this overflow?", as a + 100 < a,
your compiler may, depending on your settings and on how smart your compiler is, optimize it down to that. You should write something like this: this is all defined behavior, and it does what you want, as opposed to "return false".

Tools are starting to appear. Clang has -fsanitize=undefined. It's a compiler pass and a custom runtime. It does not detect undefined behavior at compile time; it detects it at run time. You build your program with these settings, then you run your test suite, and it will flag undefined behavior as it happens. This is the bomb. This is such a wonderful thing. John Regehr has an integer overflow checker; this is part of Clang as well: -fsanitize=integer will warn you about these. Last summer there was this program called STACK; I have a link to the paper. The code is still very much a work in progress, but if you're willing to baby it, you can get some really good results out of it.

So, quick quiz: think like a compiler. How would you optimize this code? Anybody want answers? It would appear so. So, yeah: you indirected here, you read from it here, so you say, okay, either it's not null or I'm in undefined-behavior land. Fine. I check to see if it's null, but, you know what, I already decided it's not null, so we can get rid of this. And then, oh well, look, this value is never used, so I don't need to actually do that.

Um, questions. Well, references: John Regehr's blog, lots of stuff here. This is the STACK paper, right here. The user manual for Clang and the undefined behavior sanitizer. The LLVM blog has a three-part thing, "What Every C Programmer Should Know About Undefined Behavior". More things from John Regehr, and an ACCU presentation from last year about unspecified and undefined behavior. This is a really interesting talk if you're interested in code generation, because it basically takes a bunch of code samples, compiles them with a bunch of different compilers, and then examines the object code.

Uh, we are out of time. Um, I will be quite happy to answer questions.
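The slides are not part of this transcript, so the following is only a minimal sketch of the two patterns the talk describes, not the speaker's actual code. The names add_would_overflow and read_or_default are hypothetical. The first helper answers "will this overflow?" by comparing against the limits of int before adding, so the overflowing addition never happens. The second tests the pointer before dereferencing it, so the compiler has no earlier dereference from which to conclude the pointer is non-null.

```cpp
#include <limits>

// Hypothetical helper (not from the talk's slides): reports whether a + b
// would overflow int, without ever performing the overflowing addition.
// Contrast with the shape discussed in the talk, `a + 100 < a`, where the
// addition itself is undefined on overflow, so an optimizer is allowed to
// assume it never overflows and reduce the test to `false`.
bool add_would_overflow(int a, int b) {
    if (b > 0) {
        // max() - b cannot overflow because b is positive.
        return a > std::numeric_limits<int>::max() - b;
    }
    if (b < 0) {
        // min() - b cannot overflow because b is negative.
        return a < std::numeric_limits<int>::min() - b;
    }
    return false;  // adding zero never overflows
}

// Hypothetical helper: the null test comes before the dereference.
// If the read came first, the compiler could reason "either p is non-null
// or this is undefined behavior" and delete the later null check.
int read_or_default(const int* p, int fallback) {
    if (p == nullptr) {
        return fallback;
    }
    return *p;
}
```

For the talk's example, the check would be written as add_would_overflow(a, 100). Code like this can also be built with Clang's -fsanitize=undefined, as mentioned above, and run under a test suite to flag any undefined behavior that remains.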