Skip to content

Conversation

@sinhrks
Copy link
Contributor

@sinhrks sinhrks commented Dec 17, 2016

This should simplify AtheMathmo/rusty-machine#159 a bit.

Even though normal Vec.sort requires Ord, Vector.sort is available if value impls PartialOrd (to handle floating points).

@AtheMathmo
Copy link
Owner

Looks good to me. I wonder if we can go a step further by allowing the user to specify the ordering function? Maybe an Option so we can default to the current sort?

I'm not sure if this is a good idea or not. The pr as it is can be merged.

@Andlon
Copy link
Collaborator

Andlon commented Dec 17, 2016

I think we should think much more thoroughly about this. Rust has separate PartialOrd and Ord traits for a reason. While the proposed implementation makes sense for a mathematical vector in the case of f32 and f64, it may give some unexpected behavior in cases that we haven't yet considered. I think it works as expected for all primitive types, but could there be cases where the user legitimately has a user-defined type which only admits partial comparison?

For example, consider complex numbers. Complex numbers are not a totally ordered set (with respect to their magnitude) unless you impose some additional ordering constraints. Trying to sort a vector of complex numbers may thus give unexpected results (panic at runtime). This is exactly the reason why the Ord and PartialOrd traits exist in the first place, as the compiler would let you know that the set you are trying to sort does not admit a total ordering.

@sinhrks
Copy link
Contributor Author

sinhrks commented Dec 17, 2016

Ok, how about:

  • add sort and sort_by which simply delegates to Vec.
  • add another sort method which can handle both int and float (defined in each primitives)

@Andlon
Copy link
Collaborator

Andlon commented Dec 17, 2016

Sort of playing the (conservative) devil's advocate here:

  • Vector is primarily a mathematical entity. Of course, its API also accommodates some non-mathematical operations that are frequently used in practice when working with mathematical vectors.
  • "sorting" is not a mathematical operation. Rather, it's an operation on the underlying data, and so I think .data().sort() is even semantically a better choice than incorporating custom methods.

Finally, sorting floating point numbers in this fashion should probably be something like a "try sort", i.e. it should rather return a Result indicating success or failure rather than simply panicing, since whether the floats are sortable or not is a property of the data (which in principle could come from any source), not an easily checked structural a priori property (such as the dimension of a vector, for instance). Alternatively, an even better choice might be to simply decide on how to order NaNs relative to other floats, see for example how the quickersort crate does this.

@AtheMathmo
Copy link
Owner

I agree with most of what @Andlon has said. Although right now we do not have the means to sort the Vector in place. This is because vector.mut_data() returns a &mut [T]. We do not return a &mut Vec or the size of the data could be modified (and we maintain a size field).

I think that being able to sort the vector in place seems like a useful operation. Though as pointed out above a somewhat complicated one in practice for us.

Personally @sinhrks suggestion seems reasonable to me. Where possible allow us to use the std::Vec sort function and figure out the more complicated cases (maybe later).

@Andlon
Copy link
Collaborator

Andlon commented Dec 17, 2016

@AtheMathmo: sort is a method on primitive slices though, not on Vec!

@AtheMathmo
Copy link
Owner

Ah yes!

I'm not sure why I was under the impression this was not the case. Is there a particular reason why this is not enough for you @sinhrks ?

@sinhrks
Copy link
Contributor Author

sinhrks commented Dec 18, 2016

Thx to the feedback. v

What I currently wants are argsort and percentile, and intends to use top level sort. I'll use utility func to impl them.

@sinhrks sinhrks closed this Dec 18, 2016
@robsmith11
Copy link

So is there an easy way to sort a Vector<f64>? I understand the that the standard Rust Ord is very strict about NaNs, but for a numerical library like this, wouldn't a more convenient handling be appropriate?

Also argsort would be very useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants