
return type declarations #1090

Closed
JeffBezanson opened this issue Jul 27, 2012 · 105 comments

@JeffBezanson
Member

Provide this convenient shorthand:

function foo(x)::T
  ...
  return z
end

for this:

function foo(x)
  ret::T
  ...
  ret = z
  return ret
end
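For readers following along later: this sugar eventually shipped (Julia 0.5+) with convert semantics. A minimal runnable sketch of the long form above, using Float64 as a stand-in for T and foo_long as an illustrative name:

```julia
# Long form of the proposal: a typed local converts on every assignment,
# so every return path yields the declared type.
function foo_long(x)
    local ret::Float64   # each assignment to ret calls convert(Float64, ...)
    ret = x + 1
    return ret
end

foo_long(1)   # 2.0 (an Int result, converted to Float64)
```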
@johnmyleswhite
Member

Glad to hear you're planning to do this. Will we be able to specify that another function only accepts as inputs functions with specific types of returned values?

@StefanKarpinski
Member

Will we be able to specify that another function only accepts as inputs functions with specific types of returned values?

What you're talking about are function types, and I suspect we'll probably not support that, because covariance/contravariance/subtyping issues get pretty complicated and confusing. What kind of use cases did you have in mind for this? We did intend to have this at one point, and the syntax String-->Int would have been used for a method that takes Strings and returns Ints.

@johnmyleswhite
Member

The use cases are things like optimization functions: you want to insist that the function that purports to return a Hessian at a minimum is returning a Matrix. It's not so valuable that I'd worry about it if building support causes headaches.

@StefanKarpinski
Member

There will still always be run-time checks for things like that, and potentially compiler modes that can do inference and warn if you're doing something that looks wrong. But I'm not too sold on adding function types back in. They're a lot of trouble for little gain in a dynamic system like Julia. Of course, in Haskell you simply need them.

@johnmyleswhite
Member

Understood. I have no feel for these things, so I'm sure you're right.

@StefanKarpinski
Member

Well, I'm not at all sure I'm right, so there's that ;-)

@diegozea
Contributor

I agree with what davekong says in #1078:

Uint8 + Uint8 = Int64

looks too strange and unexpected.

I understand what Stefan says in that issue, and I think something like a return type declaration could help. Maybe it is better to promote to Int for the calculation, but it's good to give the user the expected type: Uint8 + Uint8 should be a Uint8, even if the operands are promoted to Int64 for the calculation (unless you need more bits).

If you have an Array of Uint8 occupying N bytes of memory, a single scalar multiplication gives you an object consuming 8 times more memory...

Or, in my case: I defined my own 8-bit bitstype and the following promotion rule (I'm not sure whether T, Int, or Uint8 is the right choice for the result type):

-(x::NucleicAcid, y::NucleicAcid)   = int(x) - int(y)
promote_rule{T<:Number}(::Type{NucleicAcid}, ::Type{T}) = T

Simply to get uppercase letters, I need to explicitly convert the output. But if I set the promotion rule to NucleicAcid, I am going to get worse performance, as Stefan pointed out before...

julia> seq = nt"ACTG"
4-element NucleicAcid Array:
 A
 C
 T
 G

julia> seq .+ 32
4-element Int64 Array:
  97
  99
 116
 103

julia> nucleotideseq(seq .+ 32) 
4-element NucleicAcid Array:
 a
 c
 t
 g

I know I can simply define a function taking a NucleotideSeq and returning the same type, so users would not see the explicit conversion... But in general it is not intuitive.

I'm not sure if a return type declaration would be useful here.
...Maybe something like the following?

promote_rule_IO(::TypeOne, ::TypeTwo) = TypeForOperate, TypeForOutputOnBinaryOperations 

@JeffBezanson
Member Author

In the case of arrays, I agree. See #1641 .

@johnmyleswhite
Member

Having spent more time working with optimization, I have to say that I am now much more interested in one day adding the ability to dispatch on functions typed by the combination of their input types and their return types. It would be nice to be able to write separate definitions for:

  • derivative(f::Real -> Real)
  • derivative(f::Real -> Vector{Real})
  • derivative(f::Vector{Real} -> Real)
  • derivative(f::Vector{Real} -> Vector{Real})

@timholy
Member

timholy commented Jan 11, 2013

This is why in all my optimization routines, I pass the gradient into the objective function. Once it's an argument, you can do dispatch on it.

@rsofaer
Contributor

rsofaer commented Jun 18, 2013

One reason I'd like to be able to specify return types for functions is that it would improve looking at functions in the REPL. With return types I can much more easily tell at a glance what the functions in the list given by * do.

@evanrinehart

If anything, this would add a lot to the self-documenting nature of a function's source code. I really miss having the return type in the source/documentation of basically every other popular dynamic system. It would also help in mentally planning the body of the function while writing it. It would discourage implementation of annoying behavior like PHP's tendency to return a NULL, or a false, or a normal value; such behavior could be made fully explicit to the reader with union types. Looking forward to this feature!

@jperla

jperla commented Feb 12, 2014

I'm working on a medium-ish Julia project. Even if this weren't involved in the type system (i.e. if it just automatically added an assert at the bottom of the function so I don't have to do this manually everywhere), it would prevent a lot of errors and help with self-documentation.

@dbettin

dbettin commented Feb 20, 2014

I think the big win here is self-documentation. The ability to look at documentation and/or code and immediately understand the type intent is invaluable.

@StefanKarpinski
Member

The main question here is monotonicity. It would be a nice property if declaring that foo(::AbstractA, ::AbstractB)::C implies that foo(a,b)::C for all a::AbstractA and b::AbstractB. However, this implies that return types declared for very generic methods apply to more specific methods as well.

@johnmyleswhite
Member

Admittedly not the best solution, but why not just document the most specific return type that is returned for any input types? That's what I've done while starting to document some packages.

@nalimilan
Member

In one sense, allowing the most abstract method definition to constrain all other specific methods to return objects of the same type is a feature. It makes everything more predictable since you don't have to worry about whether a specific type will give unexpected results. Then, the author of the generic function has to choose the best abstraction level to leave enough room for more specific implementations (or leave it unspecified if needed).

@toivoh
Contributor

toivoh commented Feb 21, 2014

The monotonicity constraint is indeed a feature. The main motivation was to use it to let map, broadcast, etc pick a suitable element type for the result in a sane and predictable way. (Unlike exploiting type inference for this, which is somewhat unpredictable)

@StefanKarpinski
Member

I definitely agree that it's a good feature. It's one that has to be rather carefully designed, however.

@mauro3
Contributor

mauro3 commented Mar 12, 2014

The main thing I missed in Python compared to Matlab was that there are no return "types". What Matlab has is not really return types either, but it still specifies which variables are returned, which helps both with documentation and correctness. So +1 for this.

What @JeffBezanson originally proposed is just syntactic sugar, so as far as I can tell, it's just a matter of implementing it (or not). Although I would propose to have it as a type-assert and not a type-convert:

function foo(x)::T
  ...
  return z
end

would be equivalent to

function foo(x)
  ...
  return typeassert(z, T)
end

(for a discussion of the subtle difference to Jeff's original proposal see https://groups.google.com/d/msg/julia-dev/pGvM_QVmjX4/V6OdzhwoIykJ )
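A runnable contrast of the two semantics in later Julia (names are illustrative; the feature as eventually shipped uses convert semantics):

```julia
# Convert semantics: the behavior of `::T` on a signature as shipped in 0.5+.
conv(x)::Int8 = x
conv(1)                   # Int8(1): the Int 1 is silently converted

# Typeassert semantics (the proposal above), written out by hand:
asserted(x) = typeassert(x, Int8)
asserted(Int8(1))         # fine: already an Int8
# asserted(1)             # would throw a TypeError, no silent conversion
```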

The somewhat related topic which has been discussed here is whether the types of functions and methods should contain their calling and return signature. I think this is a sufficiently different topic that it should actually be a different issue. However, the present issue could be a stepping stone for that one.

@JeffBezanson
Member Author

I definitely see the argument for using a typeassert. But a big part of the value of doing a convert is that it makes it easier to write type-stable functions. For example you can write

f{T}(x::T)::T = foo ? x : x+1

and know you have a T->T function, without worrying about weird behaviors that + might have, and without getting nuisance assertion failures.
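In later `where` syntax, Jeff's one-liner can be sketched as follows (f_stable is an illustrative name, and a concrete condition stands in for foo):

```julia
# The `::T` declaration converts whatever the body returns back to T,
# so both branches yield a T even though Int8 + 1 promotes to Int.
function f_stable(x::T)::T where {T}
    cond = x > zero(T)    # illustrative stand-in for `foo`
    return cond ? x : x + 1
end

f_stable(Int8(2))    # Int8(2)
f_stable(Int8(-1))   # Int8(0): the Int result of -1 + 1 converts back to Int8
```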

@StefanKarpinski
Member

Keep in mind that LLVM is generally smart enough to figure out that if it has some code that goes from say Int8 to Int8 but passes through Int that it can just cut out the middle man, so conversion is generally more convenient and no less efficient.

@mauro3
Contributor

mauro3 commented Mar 12, 2014

I see. Two counter arguments:

  • The :: in the argument part of a function has a typeassert character and will throw an error. Thus it is a bit confusing to have two subtly different semantics of :: so close together.
  • From the perspective of writing more complex numerical functions: I want type-stable code throughout the function, not a conversion at the end. The typeassert could provide at least some of that certainty. More generally, it could also catch some errors instead of silently converting. Also, presumably LLVM will struggle to optimise more complex functions?

Either way it would be good sugar to have.

(These two answers also clear up some of the questions I had in that referenced mailing list post, thanks)

@adnelson

Will this shorthand also be available for the function shorthand syntax?

incr(x::Int)::Int = x + 1

Which, by the way, makes me wonder if Julia might benefit from the syntax x::Float32 = 1 as sugar for x = convert(Float32, 1)? Similarly foo(x::T1)::T2 = bar might desugar to foo(x::T1) = convert(T2, bar)

@JeffBezanson
Member Author

Yes, that will be how it works.

Currently x::Float32 = 1 declares x to be a Float32, in a manner similar to C. All assignments to x then call convert(Float32, ...), so we already do that part.
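That typed-local behavior can be checked directly; a small sketch (inside a function, since typed globals were not supported at the time):

```julia
function typed_local_demo()
    x::Float32 = 1      # declares x::Float32; the Int 1 is converted
    x = 2.5             # converted again on assignment: x is 2.5f0, not a Float64
    return x
end

typed_local_demo()   # 2.5f0
```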

@JeffBezanson
Member Author

Ah, that is quite nice!

@StefanKarpinski
Member

Yes, that's really nice. On the input direction, this is similar to why I think it would be nice to do implicit conversion to the declared type of an argument, e.g.:

f(x::Nullable{Int} = 0) = x

f() # => returns Nullable(0)

Otherwise you have to write this as

f(x::Nullable{Int} = Nullable(0)) = x

which just seems obnoxiously redundant and unJulian.

@blairn

blairn commented Jul 12, 2016

so I'd be able to write something like...

function groupby{A,B}(xs::Vector{A}, f::A -> B)::Dict{B, Vector{A}}
  result = Dict{B, Vector{A}}()
  for x in xs
    key = f(x)
    result[key] = push!(get(result, key, Vector{A}()), x)
  end
  result
end

@mauro3
Contributor

mauro3 commented Jul 12, 2016

No, this is not about function types which contain their return type. The best you can do is:

function groupby{A,B}(xs::Vector{A}, f, ::B)::Dict{B, Vector{A}}
  result = Dict{B, Vector{A}}()
  for x in xs
    key = f(x)
    result[key] = push!(get(result, key, Vector{A}()), x)
  end
  result
end

i.e. you'd need to manually pass in an instance of B.
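A variant of this workaround passes the type itself rather than an instance, dispatching on ::Type{B} (a sketch in later `where` syntax; groupby2 is an illustrative name):

```julia
# Callers pass the type, e.g. `Bool`, instead of a value like `true`.
function groupby2(xs::Vector{A}, f, ::Type{B}) where {A,B}
    result = Dict{B,Vector{A}}()
    for x in xs
        # get! fetches the bucket for this key, inserting an empty one if absent
        push!(get!(result, f(x), Vector{A}()), x)
    end
    return result
end

groupby2([1, 2, 3], iseven, Bool)   # Dict(false => [1, 3], true => [2])
```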

@blairn

blairn commented Jul 12, 2016

I was kinda hoping if the function knew its return type, I'd be able to use it as the key in the dictionary.

Either way, I think it is an improvement. If nothing else, it should help with the documentation.

@mauro3
Contributor

mauro3 commented Jul 12, 2016

Actually, this works:

julia> fn(g, x)::Dict{Symbol, typeof(g(x))} = Dict{Symbol, typeof(g(x))}()
fn (generic function with 1 method)

julia> @code_warntype fn(sin, 5.6)
# ... looks good

Note that type-inference is fine irrespective of whether you use the return-type annotation.

@blairn

blairn commented Jul 12, 2016

That is pretty awesome!

I was hoping I had a way to look at a function and see what it would return for a type.
sin returns a number, but without handing it a number and calling it, I can't know that it does. However, I'm pretty sure the compiler does know.

Anyway - it is no big deal, it would make some code a bit more efficient, and make a couple of things a bit nicer. But overall? I can work around it pretty easily.

@yuyichao
Contributor

fn(g, x)::Dict{Symbol, typeof(g(x))} = Dict{Symbol, typeof(g(x))}()

Not sure what this tries to achieve but FYI g(x) is called twice.

@mauro3
Contributor

mauro3 commented Jul 13, 2016

My understanding was that the purpose is to annotate the return type. I'm aware that this comes at the cost of two function evaluations. Is there a way which avoids the two evaluations? (The frowned upon) Base.return_types cannot do it:

julia> fn{X}(g, x::X)::Dict{Symbol, Base.return_types(g, Tuple{X})[1]} = Dict{Symbol, typeof(g(x))}()
fn (generic function with 1 method)

julia> @code_warntype fn(sin, 5)
# looks bad

But yes, probably better to leave the annotation off.

@dataPulverizer

Hi guys, I recently came across the Sparrow programming language, a statically compiled language with very simple syntax but with hyper-metaprogramming capabilities, that is, the ability to go from runtime to compile time. I think this is a revolutionary step forward in computing, and this approach would solve the challenges Julia has been facing. Since the language is static, it already has return types, but it does not suffer from the problems caused by the compiler not recognising types, for example the challenges with DataFrames (http://www.johnmyleswhite.com/notebook/2015/11/28/why-julias-dataframes-are-still-slow/).

It's something the D community has been seriously looking at (https://forum.dlang.org/thread/[email protected]). There you can find the link to the creator's PhD thesis, one of the best texts I've read on programming, period. I think this approach is something to be seriously considered.

@JeffBezanson
Member Author

Since the language is static, it already has return types

I might be missing something not having read all the details, but this sounds to me like a well-understood tradeoff: yes, with a static type system you can have arrow types, and instead of performance problems from a lack of type information, you get a compile-time error.

The case of DataFrame getindex could be solved, for example, with something like a @generated function that could also see (constant) argument values.

@johnmyleswhite
Member

I'm hesitant to comment on an old, closed thread, but I would note that several of us have converged on translating DataFrames into row-iterators that generate rows as well-typed tuples (which is possible now that we have Nullable scalar objects that can handle missing values). I think it's likely that we'll simply remove getindex from a future variant of DataFrames rather than try to improve on it.

@dataPulverizer

I didn't realise that the DataFrame issue was solved. @JeffBezanson why would you get a compile-time error?

@dataPulverizer

@johnmyleswhite does this mean that in the future DataFrames will be able to efficiently process tables with columns of arbitrary type or will the types be bound to a specific set?

@johnmyleswhite
Member

Let's have that conversation elsewhere and in a few months from now. :)

@dataPulverizer

Fair enough

@JeffBezanson
Member Author

why would you get a compile-time error?

That's generally what happens when a compiler for a statically-typed language can't figure out the type of something. Which will happen eventually. To say much more we'd have to drill down on what "approach" you're talking about more specifically. For example Sparrow supports calling functions on constants at compile time, but the name-to-type mapping in a DataFrame isn't a compile-time constant.

+1 To row iterators. I've had success using tuples and NamedTuples with that approach.

@dataPulverizer

As far as I can see from his thesis, he has developed semantics for the user to declare what should be done at compile time and what at run time, as well as a default state where the compiler works out what should be done depending on the nature and typing of the inputs. "If a metaprogram has bugs, it will cause the compiler to crash or behave incorrectly", as would be the case in a run-time system. But I think he is the best person to speak about this aspect.

@dataPulverizer

I think, however, it would certainly be possible to create rules that deal with this aspect, as a sort of compile-time exception handling.

@davidanthoff
Contributor

I have a completely type-stable row iterator for DataFrames in Query here. It uses NamedTuples to represent rows and works great. You can write arbitrary queries against DataFrames with Query, and there are no type-instability problems, because that iterator essentially solves the problem.

@lucteo

lucteo commented Sep 8, 2016

@JeffBezanson one of the major points of statically-typed languages is to get errors. They show you that most probably you got something wrong. In my view, if you want efficiency, you are better off in this category of languages. Moreover, I would argue that it's really important to be able to easily figure out how a particular piece of code translates to machine language (at least to some degree). If the language is allowed to do a lot of "clever" things, it becomes harder to actually understand the performance characteristics of the code. That inevitably leads to less efficient code. I think the example given by @johnmyleswhite in http://www.johnmyleswhite.com/notebook/2015/11/28/why-julias-dataframes-are-still-slow/ is perfect here.

If, on the other hand, the language allows you to write simple code that everyone understands how it translates to machine code (again, to some degree), then, the language must make it pretty clear what run-time is. That means the language needs to create a clear distinction between run-time and compile-time.

Yes, the distinction between run-time and compile-time can be annoying in some cases, but I would argue that those cases appear very seldom in practice. After all, we know that our programs will never be like "compiler, please solve my problem (and you should be able to infer which problem I'm referring to)"

My 2 cents,
LucTeo

@johnmyleswhite
Member

Let's not continue this broad design conversation on this issue as every comment sends a notification to many people.

@StefanKarpinski
Member

@lucteo: Julia is not currently a static language and won't become one in the future, so I'm not sure what the point of your comments is. The alleged simplicity of having semantically different compile-time and run-time phases is contradicted by the many confusions and troubles that arise from this distinction in static languages: virtual vs non-virtual methods, overloading vs dispatch, etc. – these are some of the most chronically problematic and hard-to-explain issues in static languages.

@JeffBezanson
Member Author

I'll reply on the julia-dev list.

@mauro3
Contributor

mauro3 commented Sep 9, 2016

@gyk

gyk commented Sep 6, 2018

Another problem: if the argument list gets too long and you want to put the ::ReturnType on the second line, Julia complains about invalid syntax: ERROR: syntax: invalid "::" syntax.

@yurivish
Contributor

yurivish commented Sep 6, 2018

If you put the closing ) on the last line then it parses:

julia> f(a, b, c
       )::Int = 3
f (generic function with 1 method)

It may be more aesthetic to do it this way:

julia> f(
           a, b, c
       )::Int = 3
f (generic function with 1 method)
