Work in progress
You can always find the latest version of AngelScript in the SVN on SourceForge.net. There you can download a tar ball with the latest revision or browse the repository online.
If you prefer to use an SVN client to download the code, point your client to the following address: https://angelscript.svn.sourceforge.net/svnroot/angelscript/trunk
I recommend TortoiseSVN as the SVN client if you're using Windows, otherwise the original SVN is the best alternative.
Version 2.19.0 WIP - 2010/03/14
Known bugs
Changes planned for later versions
These are some of the changes that I'm thinking about. Not all of these may actually be implemented, and some may be implemented a bit differently than described, but all describe my current thoughts. The list is ordered roughly by priority. The item I'll implement next depends mostly on the complexity of the feature, my current interest, and feedback from the user base. You're always welcome to send me your comments on current and/or upcoming features.
version 3.0.0 requirements
planned for next releases
planned for 2.19.0
Library
Improved documentation
Some suggestions for improved documentation:
Improved garbage collection
The garbage collector should have the notion of long-lived and short-lived objects. Long-lived objects will not be checked for circular references as often. Short-lived objects will be checked for circular references a few times, and then upgraded to long-lived.
Some objects, such as script functions and global variables, will always be marked as long-lived.
Improve Getters and Setters for properties
Allow registered get methods to use an output parameter instead of the return value to return the property value.
Allow the property accessors to access a real property of the same name without going into a loop.
Allow use of property accessors with indexes to wrap native C++ arrays. Suggestion by Bismuth.
Allow use of compound assignment with property accessors, where possible. Forum thread. If the property is an object and the object has the compound assignment operator, what should be done then? For primitives we know what the result should be, but for objects we do not, so we cannot rewrite it to 'o = o + e', since the += operator may not do a simple addition.
More options for dynamic modules
Allow the compilation of new script sections to existing modules. The code should be merged with the already built code. When the functions overlap, they should either be replaced, or an error should be given.
I'm not quite sure how this will work, but it is something I would like to add.
Improved memory usage
Right now the compiler will compile all the code in a script, whether it is used or not. It would be great if the application could tell the compiler which functions are the entry points, so that the compiler could simply discard the rest, thus freeing up memory.
Some other things that aren't necessary to keep around are the enums and typedefs, which won't be used again unless other compilations are done.
Support smart pointers
What can be done to support smart pointers?
A smart pointer is a class that holds a pointer to the real object.
The class usually has the -> operator overloaded so that accessing members of the real object works just like accessing members through a normal pointer. The smart pointer may also take care of reference counting.
When registering the type controlled by a smart pointer it would be necessary to tell AngelScript that it is a smart pointer. A new special behaviour is necessary to register the -> operator so that the VM can automatically access the true object. When registering the members of the object it would also be necessary to indicate which members are accessed through the -> operator, and which are accessed directly.
How would it work to pass objects of this type to application functions, or return them from application functions? Usually objects held in smart pointers are passed as normal references to application functions, but sometimes it really is a reference to the smart pointer itself.
It should be completely transparent to the scripts whether an object is a smart pointer or not.
Custom operator tokens
It would be interesting to allow the application to define its own operator tokens and the class methods that implement them. The application could then add operators for things like dot product or cross product. It may even be possible to simplify the core library with this.
Module access profiles
The application should set the module access profile per entity, rather than per config group. The access profile is a bit mask, where the bit mask for the registered entity will be compared with the access profile for the module to see whether the module has access to the entity.
The access profile can be different for an object type and its methods, thus allowing modules to have access to different sets of methods depending on their profile. The object's behaviours are always in the same access profile as the type itself though (or maybe not? It may be interesting to allow different profiles for factories, for example).
The config groups will still be removed as a whole.
The access profile should be set as a parameter when registering the type or function. The parameter should be optional (with value 0), and will by default add the type/function to all profiles. There will not be any Begin/End functions for setting access profiles. We may use an engine property to set the default access profile, if necessary.
The module's access shouldn't be set in the module, but rather be given as a flag to the compiler when the compilation is done.
User data
The asIScriptContext and asIScriptEngine should have a way of registering a function to clean up the user data once the context/engine is destroyed.
Improved multithread support
Add multithreaded support for more platforms. Add documentation for how to turn off multithreading for performance improvements in non-multithreaded environments. We may also be able to optimize the multithreaded code slightly by using code from the LDK library, suggestion by Lorien Dunn.
Allow applications to disable global variables in scripts through an engine property. This is useful if the application plans on having multiple threads execute scripts from the same module simultaneously and doesn't want to burden the script writer with having to think about how to control the access to the global variables to avoid memory corruption.
Multiple OS fibers
Look into properly supporting switching of OS fibers from within script contexts. Perhaps we need to use fiber local storage for the stack of active contexts in order to permit this kind of switching.
Currently if you have two contexts running in different fibers that switch between each other from within asCScriptContext::Execute, the asGetActiveContext() function may not return the correct context. In debug mode it may cause assert failures, as there is a check when popping the active context to make sure it is the correct one that is popped.
A switch of fibers from within asCScriptContext::Execute may also leave loose references on the context stack if the application registered function that performs the switch is not properly implemented. To guarantee no loose references, the function should be a void function that takes no arguments. This way the script can only call it as a simple function call, rather than in a complex expression.
Optimize strings
The string constants used throughout the scripts should be instantiated once by the compiler and stored in a pool. The generated code will then receive constant references to these pooled strings. This will greatly improve performance for scripts that do a lot of string comparisons with constant strings, and will also permit application conversion of string constants from 8bit to 16bit for example.
Consider adding the primitive type 'text' which is basically a 'const char * char'. The string literals would be of this type, instead of the registered string type. When assigning one 'text' variable to another, only the pointer is copied, without any reference counting.
The 'text' type has some tempting advantages in that it is fast and efficient. It would also allow registration of C++ functions that take 'char *' as parameters. However, there are problems that need to be addressed before taking this step: how should memory management be handled? What if the script stores a 'text' variable somewhere and then discards the module that compiled the text literal? The application writer could also be tempted to write functions that return 'text' literals to the script; how should this be handled?
Improved memory management for value types
Now that the application registers types as either reference types or value types, the script engine should take advantage of this and allocate value types on the stack rather than on the heap.
The same for the array object.
Better validation of registered functions
I would like to provide means for automatically validating the signature of the registered functions. If AngelScript provides an interface for determining the parameter types, based on function id, the application can use templates to automatically validate all of the parameters. It isn't possible to automatically generate the function declaration by means of templates, e.g. should a parameter reference be a &in, &out, or &inout? But it is possible to verify that an AngelScript parameter reference is mapped to a parameter reference in C++. Doing this automatically, it will be easier to avoid stack corruption when AngelScript calls the C++ function.
Stack corruption happens for example when AngelScript passes a structure by value, but the C++ function expects a reference, or if the C++ function returns a value in memory, but the function was registered as void.
Can the calling convention be determined by means of templates? I.e. is it possible to use templates to determine whether a function should be asCALL_CDECL or asCALL_STDCALL? Example code from SiCrane.
Add a way to obtain the function id of the system function being called
It may be useful to be able to obtain the function id of the system function that is currently being called by the context. From the function id the name of the function and its signature can also be obtained. Suggested by Jeff Slutter.
Registered comparison operator
For application registered types that register the comparison operators directly the engine could automatically construct an opCmp method so that the compiler doesn't have to bother with all the different behaviour types.
If only the equality operator is registered then only the opEqual function will be supported. To be able to do all comparisons, one equality behaviour and one relational comparison behaviour are needed. Example: "less than" and "equal".
The function will first compare with "less than", and if not less, it will compare with "equal".
All script code should be resumable
Currently there are moments where script code is executed but the application cannot control the execution, e.g. with the Build() call where the global variables are initialized with a call to the internal @init script functions, with CreateScriptObject() where the constructor of a script class may be called, and with script array resizing where the constructor for each of the elements is called.
Some changes will be necessary to give the application complete control over script code that is executed. CreateScriptObject and Build will be able to use an application supplied context instead of creating their own. This will be done similarly to how ExecuteString currently does it.
The script array object will also be modified to use a script function to perform the initialization of the elements. Script code can then call that script function directly instead of calling a C++ function that has to create a new context to initialize the elements. The asIScriptArray interface will of course maintain the C++ resize method, but it can internally call the same script function to initialize the elements, and permit the use of an application supplied context.
Improving context interface for calling script functions
I need a single function that can be used to set any argument. Something like SetArgValue(int index, void *ptrToValue, bool takeOwnership = false).
When the argument is an object handle and takeOwnership is false, the context will automatically increase the reference count. If takeOwnership is true, the context won't increase the reference count, which means the application must not release it.
When the argument is an object by value, the context will automatically create a copy of the object, unless takeOwnership is true, in which case the context will store the object pointer and then release it upon completion.
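The handle ownership rule described for SetArgValue can be illustrated with a small mock. This is only a sketch under stated assumptions: `RefCounted`, `MockContext`, `SetArgHandle`, and `ReleaseArgs` are hypothetical names that are not part of AngelScript, and only the object handle case is modelled (not objects passed by value).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a reference-counted application object.
struct RefCounted {
    int refCount = 1;                 // the creator holds the first reference
    void AddRef() { ++refCount; }
    void Release() {
        assert(refCount > 0);         // releasing more often than adding is a bug
        --refCount;                   // a real object would delete itself at zero
    }
};

// Hypothetical mock of a context that stores object handle arguments.
class MockContext {
public:
    // Mirrors the proposed SetArgValue, restricted to object handles.
    void SetArgHandle(std::size_t index, RefCounted *obj, bool takeOwnership = false) {
        if (index >= args_.size()) args_.resize(index + 1, nullptr);
        if (args_[index]) args_[index]->Release();   // drop a previously set argument
        // Without takeOwnership the context adds its own reference;
        // with takeOwnership it adopts the caller's reference instead.
        if (obj && !takeOwnership) obj->AddRef();
        args_[index] = obj;
    }

    // Called when execution completes: drop every reference the context holds.
    void ReleaseArgs() {
        for (RefCounted *obj : args_)
            if (obj) obj->Release();
        args_.clear();
    }

private:
    std::vector<RefCounted *> args_;
};
```

In this model, an application that passes `takeOwnership = true` has given its reference away and must not call `Release()` itself, which matches the rule stated above.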
Shared type ids
Currently, if interfaces are declared equally in two different modules they will share the type id, so that objects that implement this interface can be exchanged between the modules without any difficulties.
I need to expand this to other types, such as typedefs and enums, as well. Classes are a bit more complicated, because they come attached with function implementations, which may access global variables, or other types that are not shared, etc.
Improved portability
AngelScript calling convention
Object parameters
All objects should be passed by reference, as if it was a const reference. The script functions that access object parameters for modification will copy the object first and then access the copy instead. Application registered functions that register parameters with a non-const input reference should be wrapped automatically by the compiler to pass a copy of the object to the parameter. This will allow me to remove parameter references from the script language, without losing the performance benefits of using const &in references. Parameters can still be declared as const, but it won't make a difference to the caller, as it will always pass parameters expecting them not to be modified. This may affect function overloading.
Object handles
The caller should be the owner of the object handles passed to functions. This will reduce the number of addref/release calls made during script execution. It will also simplify the registered application functions, as they will no longer have to release the handles received in parameters. The script compiler will need to make sure that all object handles passed to functions are stored in local variables (or parameters in the function). This is similar to how the compiler currently guarantees the validity of references.
In order to avoid abrupt changes to applications I'll implement this feature as an engine property: asEP_CALLER_OWNS_HANDLE. The value will be false by default. After a few releases I can change the default to true. Then after even more releases, the engine property can be removed altogether.
Improved object type registration
Look into what to do with asOBJ_NOHANDLE and ALLOW_UNSAFE_REFERENCES. Currently the engine permits the types to be passed by reference. It probably shouldn't allow that, but is it necessary to block it?
Can we determine the asOBJ_APP flags through templates without losing portability?
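One possible direction for the template question above is to derive the flags from compile-time type traits. The following is a minimal sketch, assuming a C++11 compiler with `<type_traits>`; the `kApp...` constants and `DeduceAppFlags` are hypothetical names used for illustration, not the library's asOBJ_APP values, and the mapping from traits to flags is an assumption.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical flag values, standing in for the registration flags.
enum AppTypeFlags : unsigned {
    kAppPrimitive        = 1u << 0,
    kAppFloat            = 1u << 1,
    kAppClass            = 1u << 2,
    kAppClassConstructor = 1u << 3,
    kAppClassDestructor  = 1u << 4,
    kAppClassAssignment  = 1u << 5
};

// Deduce the flags for T from the standard type traits.
template <typename T>
unsigned DeduceAppFlags() {
    if (std::is_floating_point<T>::value) return kAppFloat;
    if (!std::is_class<T>::value)         return kAppPrimitive;

    unsigned flags = kAppClass;
    // A non-trivial special member means the class declares (or inherits) one.
    if (!std::is_trivially_default_constructible<T>::value) flags |= kAppClassConstructor;
    if (!std::is_trivially_destructible<T>::value)          flags |= kAppClassDestructor;
    if (!std::is_trivially_copy_assignable<T>::value)       flags |= kAppClassAssignment;
    return flags;
}

// A plain struct with only trivial special members.
struct Pod { int x; int y; };

void DeduceAppFlagsExample() {
    assert(DeduceAppFlags<float>() == kAppFloat);
    assert(DeduceAppFlags<int>() == kAppPrimitive);
    assert(DeduceAppFlags<Pod>() == kAppClass);
}
```

Whether traits like these are available on every compiler the library targets is exactly the portability concern raised above, so this only shows that the deduction is expressible, not that it is portable enough.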
Cloning modules
Cloning modules will create an identical copy of the first module. The result should be the same as if a second module had been compiled with the same script.
The function ids for each module will be different, as a function id is a global id that uniquely identifies a function in the engine. The same goes for object types. This is especially important when the functions/classes access global variables in the module. Interface ids will however remain the same for both modules.
A module can also be cloned by saving the bytecode from one module and then reloading it into another. But if the script engine can clone the modules directly it will be faster than having the application take care of it.
A common super class for all script classes
Maybe a super class for all script classes can be implemented. The super class won't have any properties or methods. It is only used as a common base, so that all classes can be stored in a common way. It could also be seen as an interface that all classes implement.
Potential drawback: Application registered classes won't always be derived from the super class, thus separating them from the script classes.
Advantage: All script classes will have a common denominator, facilitating the storage of them. An application registered class that derives from an AngelScript class interface can be registered to have this same common denominator, which should make it possible to have script classes inherit from application registered classes.
Evaluate the possibility of using dynamic calls
It might be possible to implement something like dispatch calls in AngelScript. This is something that would work for any object, and may make it easier to write scripts for some situations. What the compiler would do is something like this: 1) build a list of arguments, including a reference to the object, as well as the name of the function. 2) call a built in dispatch function.
The dispatch function would then check the object type to find the matching method and call it. If no matching method is found an exception is thrown.
Combine this with the generic superclass Object that all classes inherit from and you have a completely dynamic type.
A special dispatch operator would be needed so that the script writer can choose compile time or run time resolution of the method. Possible syntax: object->function(23, 43)
Investigate possibility of allowing reference cast behaviours for value types
Currently it is not possible to establish class hierarchies for value types. What would the implications of allowing this be? Is there any danger in allowing a temporary reference cast?
Improve template classes
It must be possible to register template types with more than one subtype, e.g. map<string,int>.
Remove built-in dynamic array objects
Related to: static arrays
The built-in dynamic array type will be removed. This will happen when I release version 3.0.0.
Allow initialization lists for registered types
Related to: dynamic arrays
It must be possible to register types that accept initialization through initialization lists. This probably needs a new way of registration. Possibly a way to tell AngelScript which constructor should be used, and then which index behaviours.
It would be great if the application could tell AS if a special pattern is needed for the initialization lists, for example
a map
Add support for importing class definitions from other modules, so that a class can be implemented in one module yet used in another.
Adding a UTF-16 code scanner for the compiler might be interesting. This would allow applications to avoid having to convert script code to UTF-8 before compiling it.
This is very low priority though.
It will probably need a new behaviour on the string type, as well as a new
indicator when registering the function that needs the translation.
Suggestion by Manu Evans.
Suggestion by Dentoid.
I've made great improvements on the memory footprint already, but at some time I will probably go back to this problem to see if I can improve it even further.
Statement blocks can be parsed one statement at a time, which will improve memory utilization when compiling large script functions.
Need engine properties to discard information that is not needed for runtime operation, e.g. enum values, typedefs, local variable names, line numbers, etc.
For singletons this can easily be done through wrappers as the object pointer can be resolved through a global variable or a static method to retrieve the singleton pointer.
Where the class method is not for a singleton it might be a little more difficult to solve with wrappers as it will be difficult to determine the object pointer. Because of this the best solution may actually be to implement this support in the core library after all.
A better name may be asIExecutor, since the context really is that, an interface for executing script functions.
The context holds the environment in which threads are executed. Thus it will
keep track of global variables and memory allocations, shared between threads.
A thread will only control the execution of a script, i.e. hold the call stack
while it is running. Thus it will not be necessary to keep the call stack in
memory when it is not used.
Unless non-shared global variables are implemented, this redesign is pretty
useless as the contexts will not hold anything.
A context property is a non-shared global variable registered from the
application. After creating the context, the context property must also be
set, otherwise a null pointer exception will occur.
I'm not sure if this is actually useful, so this will probably not be implemented.
The advantage of this is that it would be easier to write a common script compiler, without having to share all the code with the application that will use the scripts.
This is something that I want to do someday. Though I'm not sure how well it will work with AngelScript being so closely tied to C++. Pointers wouldn't work for example, as that is too low-level. The boolean type is also not equal on all platforms.
Things that are currently making the code platform dependent:
Could perhaps have an engine property to tell the compiler to produce some platform independent intermediate bytecode, rather than the bytecode that the VM will execute. When loading this pre-compiled independent intermediate bytecode, the loader would need to convert it to the final platform dependent bytecode.
The script language should allow this too, but it must take care to make sure the reference is valid.
It is already possible to implement enumeration of types, variables, functions, etc, and to create instances of new classes.
It's possible, but complicated, to implement an 'invoke' function that calls an arbitrary function/method with arbitrary arguments. To support this I need to implement support for variable argument lists.
It's not yet possible to dynamically create new functions from within the scripts. For this to be possible the engine needs to be able to compile single functions in a module, and have support for function pointers.
Modern languages such as D, Java, C#, and most script languages treat all reference types by handle, i.e. all variables of these types are only handles to instances, and all function parameters are also by handle.
The one thing you lose with this is an explicit assignment operator for copying content as the assignment operator will perform a handle assignment rather than copying the content of the object. (Though it may be possible to create another operator for this.)
The advantage of implementing this would be a cleaner language syntax, in that the @ can be completely removed, thus also simplifying the compiler.
I haven't decided whether I really want to implement this, but I don't think it would be too difficult. If I decide to implement this, it would be for version 3.0.0 together with the change to the parameter references.
Note, AngelScript already supports implicit handle types by turning on the engine property asEP_ALLOW_IMPLICIT_HANDLE_TYPES.
It would also be easier on the script writer I believe.
If I do this, I have to remember that variables should not just be initialized to zero when entering a new function. The variables must be initialized to zero every time a new scope is entered, e.g. each iteration of a loop.
This might be solved by using push and pop to handle the scope. When push is used to increase the stack space it will be incremented with zeroes.
I may be able to do this initialization only if the compiler discovers that the variable is used without being initialized first. That would avoid unnecessary bytecode being generated, and thus not impact performance.
Taking the address of a function must be able to resolve function overloads automatically by checking the type of the variable where the address will be stored.
It must be possible to declare function definitions that use types declared below it in the script.
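C++ already resolves overloads this way when a function's address is taken, so it can serve as a reference for the rule about checking the type of the destination variable. The sketch below is plain C++, not AngelScript syntax; `Scale`, `scaleInt`, and `scaleFloat` are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Two overloads that share one name.
int   Scale(int v)   { return v * 2; }
float Scale(float v) { return v * 0.5f; }

// The declared type of each pointer selects the overload whose
// signature matches; no cast is needed at the point of assignment.
int   (*scaleInt)(int)     = &Scale;
float (*scaleFloat)(float) = &Scale;

void OverloadAddressExample() {
    assert(scaleInt(21) == 42);        // calls Scale(int)
    assert(scaleFloat(8.0f) == 4.0f);  // calls Scale(float)
}
```

If no overload matches the pointer type exactly, a C++ compiler rejects the initialization, which is probably the behaviour a script compiler would want to mirror.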
This also means that global variables will be initialized in the order they are declared, and that a global variable doesn't see variables declared after it.
Functions and class declarations will still be seen from above themselves. This also means that if a function is called from the top of the file, it can access variables that haven't been fully initialized yet (it will only see the default value). In case of accessing object variables, this will result in a null pointer exception.
See also: Parameter references
Allow use of static arrays in the scripts, e.g. int ar[3]. These will be
directly compatible with C++ arrays. They will not be objects, so it will
not be possible to store a handle for a static array, but they can be passed
by reference to functions.
It should be possible to assign part of a bigger array to a smaller array, e.g.:
Look at the D language for more ideas on arrays.
The static array types should be treated as an easy way to declare multiple variables that can be accessed through an index.
The static array type can have a size that is defined at run time, e.g. with a parameter name. In this case the memory for
the array is allocated on the heap, plus a hidden variable that gives the size of the array. The same thing happens if the
size of the array is fixed, but too large, e.g. larger than 100 elements.
The static array type will only be implemented when the dynamic array type has been removed from the language.
When passing a static array by parameter, or returning it, the size of the array must be fixed. That way it is not necessary
to pass the size of the type as a separate variable.
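On the application side, the point about fixed sizes corresponds to how C++ passes arrays by reference: the size is part of the parameter type, so no separate length argument is needed. A minimal C++ sketch follows; `SumThree` and `Sum` are hypothetical application functions, and how such functions would be registered with the engine is not covered here.

```cpp
#include <cassert>
#include <cstddef>

// Accepts exactly three ints by reference. The size is encoded in the
// parameter type, so the caller cannot pass an array of another size.
int SumThree(const int (&ar)[3]) {
    return ar[0] + ar[1] + ar[2];
}

// Generic form: the compiler deduces N from the argument's type.
template <std::size_t N>
int Sum(const int (&ar)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) total += ar[i];
    return total;
}

void StaticArrayExample() {
    int ar[3]  = {1, 2, 3};
    int big[5] = {1, 2, 3, 4, 5};
    assert(SumThree(ar) == 6);
    assert(Sum(big) == 15);
    // SumThree(big) would not compile: int[5] does not bind to int (&)[3].
}
```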
Ex: void func(int arg1, int arg2 = 0);
I haven't decided yet if I really will implement this.
It will not be possible to store a handle to a struct, nor will the struct be
able to inherit from another class or struct.
I'm not so sure about the usefulness of this anymore.
Reference types must not be allowed in data structures. Handles can be safely used in the structures though.
To guarantee safety in the scripts the implicit copy constructor of a value type must have certain restriction. This is to guarantee that the constructor itself doesn't destroy the structure that is being copied before it is time. Should I prevent the user from implementing a copy constructor, or should I simply automatically generate a copy constructor that will be used for the implicit calls?
Even though pointers cannot be used in an environment where security is wanted, they are still very useful when the scripts are written for prototyping.
With the addition of namespaces the scripts that previously had to be compiled
in separate modules, can now be compiled in one module and still use the same
names. The benefit of this is better reuse of common functions.
Modules will still exist, of course.
Yes, I intend to allow implementation of class methods outside the class, just as in C++. I just haven't gotten around to it yet.
The asCDataType object that defines a typedef should have a subtype to the true type. All functions that ask what kind of type the typedef is should return what the subtype returns.
The script will have to add an extra keyword for parameters if the argument
is supposed to be a return value, something like: func( (out)arg );
Doing this would make the language much more consistent in the way it works, thus easier to understand for the script writers. The impact in performance and to the application developers shouldn't be too big.
AngelScript would still permit the application to register functions that take parameters by reference, but only where it can be translated to the permitted AngelScript syntax. A const type & could just as well be sent by value, and a type & would be translated to either type @ or out type.
This is obviously a bit of a step backwards, so it's not a decision to be taken lightly. If I do it, I'll probably change the version number to 3.0.0. A replacement should also be available at the time this is implemented, e.g. the option to turn on unsafe references, and/or allow use of pointers.
A global variable can be marked as shared between contexts, i.e. all contexts
referencing the same module will access the same memory for that variable.
A non-shared global variable is local to the context, i.e. each context has
its own instance of the variable.
By default the global variables are shared.
By allowing the scripts to declare non-shared global variables, the application will have a difficult time controlling the memory needed for each context. Due to this I may not implement this feature.
If I can think of a way to instantiate the global variables only for those contexts that actually need them, then I may consider this feature again.
The 'any' type will not be substituted for the 'var' type. The 'any' type is a generic container, whereas the 'var' type is a generic primitive capable of storing and working with any primitive type.
The 'var' type should have all the mathematical operators with overloads for all primitive types (+ the string type). It should also have methods like asString, asInt, asDouble to convert the variant to a primitive.
The 'any' type could perhaps be improved to allow storing index values by implementing the index operator.
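To make the description of the 'var' type more concrete, here is a rough C++ sketch of a variant that holds an int, a double, or a string, with an addition operator and the asInt/asDouble/asString conversions mentioned above. The class `Var` and its promotion rules (string wins over double, double wins over int) are assumptions made for illustration, not a design taken from AngelScript.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant primitive: holds an int, a double, or a string.
class Var {
    enum class Kind { Int, Double, String };

public:
    Var(int v) : kind_(Kind::Int), i_(v) {}
    Var(double v) : kind_(Kind::Double), d_(v) {}
    Var(const std::string &v) : kind_(Kind::String), s_(v) {}

    int asInt() const {
        switch (kind_) {
            case Kind::Int:    return i_;
            case Kind::Double: return static_cast<int>(d_);  // truncates
            default:           return std::stoi(s_);         // throws if not numeric
        }
    }

    double asDouble() const {
        switch (kind_) {
            case Kind::Int:    return static_cast<double>(i_);
            case Kind::Double: return d_;
            default:           return std::stod(s_);         // throws if not numeric
        }
    }

    std::string asString() const {
        switch (kind_) {
            case Kind::Int:    return std::to_string(i_);
            case Kind::Double: return std::to_string(d_);
            default:           return s_;
        }
    }

    // Assumed promotion rule: string beats double, double beats int.
    friend Var operator+(const Var &a, const Var &b) {
        if (a.kind_ == Kind::String || b.kind_ == Kind::String)
            return Var(a.asString() + b.asString());   // concatenation
        if (a.kind_ == Kind::Double || b.kind_ == Kind::Double)
            return Var(a.asDouble() + b.asDouble());
        return Var(a.asInt() + b.asInt());
    }

private:
    Kind kind_;
    int i_ = 0;
    double d_ = 0.0;
    std::string s_;
};

void VarExample() {
    assert((Var(2) + Var(3)).asInt() == 5);
    assert((Var(2) + Var(0.5)).asDouble() == 2.5);
}
```

A full version would repeat the operator pattern for the remaining mathematical operators; only addition is shown to keep the sketch short.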
Of course, since most applications have their own way of creating the contexts and scheduling the executions, perhaps it's better to just have automatic pushing of arguments on the context stack.
The context manager should also implement intelligent calls to the garbage collector.
The context manager should also permit debugging, by setting breakpoints and step-by-step execution.
Or is at least not considered at this moment.
ClassTest::ClassTest() : Member1(0), Member2(3) {}
I don't think I'll implement this, since it is a rather obscure syntax, and not really necessary. Especially after all reference types will be treated as handles.
Drawback: The ability to hook up line callbacks for internally used contexts may lead to the attempt of suspending those contexts, which may not be such a good idea. The code that is invoking the contexts may not be resumable, for example if the context is invoked to initialize an array element as part of the resizing of an array of script classes. It would be necessary to block the ability to suspend contexts used for this purpose, but that would in turn kind of defeat the purpose of having a line callback in the first place.
This idea has been discarded in favour of reusing existing contexts to call these functions. Where currently a context is created internally the application will be able to supply the context it wishes to use and thus be able to fully control the execution.
Do I really want to allow this? What would it mean? AngelScript doesn't allow implicit conversions of types to bool. Why should it allow a non boolean type to use the 'not' operator?