Nelkins's comments | Hacker News

There's an open source one by Samsung that is excellent, never had any issues with it: https://github.com/Samsung/netcoredbg

I think I tried that (or a derivative of it; I didn't know Samsung was the primary developer) and it broke in some very weird ways. Will try this version, thanks!

There are libraries that simulate a lot of these things (e.g. https://github.com/G-Research/TypeEquality for GADTs). You're absolutely right that it's not as first class as in OCaml, but the tools are there if you want them.


I've written type-equality witnesses in F#. They can sort of recover type equalities (via cast methods) but not refute them, so you still need to raise exceptions for those cases.
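To make the idea concrete, here is a minimal sketch of a Leibniz-style type-equality witness, written in Python with `typing` annotations rather than F# (the class and function names are illustrative, not the actual G-Research TypeEquality API). It shows the "recover but not refute" asymmetry: having a witness lets you cast, but nothing in the type system lets you prove two types are *unequal*, so impossible cases still end in a runtime exception.

```python
from typing import Generic, TypeVar, cast

A = TypeVar("A")
B = TypeVar("B")

class TypeEq(Generic[A, B]):
    """Witness that A and B are the same type (simplified Leibniz encoding)."""

    def cast(self, value: A) -> B:
        # Sound only because the sole way to obtain a TypeEq is refl() below,
        # which always instantiates A = B.
        return cast(B, value)

def refl() -> "TypeEq[A, A]":
    """The only constructor: reflexivity, witnessing A = A."""
    return TypeEq()

# Recovering an equality: with a TypeEq[int, int] in hand, we may cast.
eq: TypeEq[int, int] = refl()
assert eq.cast(41) + 1 == 42

# Refutation is not expressible: there is no value proving int != str,
# so code paths that "cannot happen" must raise at runtime instead.
def impossible_case() -> None:
    raise TypeError("unreachable: no witness for int = str exists")
```

The key design point is that `refl` is the only producer of `TypeEq`, so any value of type `TypeEq[A, B]` is evidence the two parameters coincide; the missing half, as the comment above notes, is a first-class way to *refute* an equality.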


Nelknet (https://nelknet.com/) | Senior contractors, frontend and backend | Remote (US & international, but some overlap with clients needed)

We're a small, experienced team building software solutions for US-based clients. Recent projects include:

- Building ChatGPT-style interfaces for intelligence report analysis

- Developing generative AI features for professional networking platforms

- Platform engineering for logistics/trucking management systems

- Architecture design for real estate search engines

- Implementing RAG pipelines for edtech products

We're expanding our network of senior contractors (both frontend and backend). This is a 1099 contracting relationship - you'll have the flexibility to manage your own schedule and potentially work with other clients while maintaining a professional commitment to our projects.

Requirements:

- Strong English communication skills

- Senior-level development expertise

- Experience with modern tech stacks

- Ability to work independently while collaborating effectively

- Some level of US business hours availability

If you're interested, please fill out our intake form: https://baserow.io/form/mXa6NqUOKw9CcDPAU37oPh1uOPFHHESph6k0...

If we have a project that is well-suited to your skillset, I will definitely be reaching out!

Looking forward to connecting!


I got an error when submitting the form, "Field data constraint violation", although I could not find which field was incorrect.


Same here


I'm curious about the same, but I'm also wondering if there can be an automatic election of a new primary through the use of conditional writes (or, as Fly.io calls it, CASaaS: Compare-and-Swap as a Service).
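The election idea above can be sketched with nothing more than a compare-and-swap primitive: each candidate tries to CAS a per-term leader key from "unset" to its own ID, and exactly one succeeds. This is a toy in-memory illustration (all names are made up; a real system would use the storage service's conditional-write API and add leases/heartbeats), not Fly.io's or anyone's actual implementation.

```python
import threading

class CasStore:
    """Toy in-memory store exposing only a compare-and-swap primitive."""

    def __init__(self) -> None:
        self._data: dict = {}
        self._lock = threading.Lock()

    def cas(self, key, expected, new) -> bool:
        """Atomically set key to `new` iff its current value is `expected`."""
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True

def try_become_primary(store: CasStore, node_id: str, term: int) -> bool:
    # A node wins term N only if no other node claimed that term first;
    # the CAS from None guarantees at most one winner per term.
    return store.cas(("leader", term), None, node_id)

store = CasStore()
assert try_become_primary(store, "node-a", 1) is True   # first claimant wins
assert try_become_primary(store, "node-b", 1) is False  # lost the election
assert try_become_primary(store, "node-b", 2) is True   # a later term is open
```

Bumping the term for each new election is what lets a fresh primary take over after a failure without the old one's stale claim interfering.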


There's also https://github.com/message-db/message-db

Admittedly, these options seem not to be quite as user-friendly as OP's solution.


I would love to have an eInk tablet that I can watch videos on (color not required). I frequently watch educational YouTube videos before bed, but I’d prefer to have something that isn’t beaming light into my eyes. Does something like this exist on the market today, or do I need to wait until this product gets released?


That's about the worst use case for existing eink panels, as they have a limited number of switching cycles before the dots start to degrade.


I can’t wait until this is available in more rural areas of NY. I would love to be able to take this thing to/from a bar where there’s no public transport and very low density of Uber/Lyft.




Hi, just FYI, your form is giving a 400 error ("field data constraint violation"). Unsure why that is, since I filled in all the form fields; it also doesn't say which field is problematic.


Whoops, that probably means that you've already submitted your information. There's only one unique constraint on the email field.


Akka.NET and AvaloniaUI are two big ones.


I'll add Orchard CMS to this list. Also a lot of the seven seas software.


Cool, but this is very specific to DataFusion, no? Is there any chance this would be standardized so other Parquet readers could leverage the same technique?


The technique can be applied by any engine, not just DataFusion. Each engine would have to know about the indexes in order to make use of them, but the fallback to parquet standard defaults means that the data is still readable by all.


But does DataFusion publish a specification of how this metadata can be read, along with a test suite for verifying implementations? Because if they don't, this cannot be reliably used by any other implementation.


Parquet files include a field called key_value_metadata in the FileMetadata structure; it sits in the footer of the file. See: https://github.com/apache/parquet-format/blob/master/src/mai...

The technique described in the article seems to use this key/value metadata to store pointers to the additional metadata (in this case a distinct index) embedded in the file. Note that we can embed arbitrary binary data in the Parquet file between each data page. This is perfectly valid since all Parquet readers rely on the exact offsets to the data pages specified in the footer.

This means that DataFusion does not need to specify how the metadata is interpreted. It is already well specified as part of the Parquet file format itself. DataFusion is an independent project -- it is a query execution engine for OLAP / columnar data, which can take in SQL statements, build query plans, optimize them, and execute them. It is an embeddable runtime with numerous ways for the host program to extend it. Parquet is a file format supported by DataFusion because it is one of the most popular columnar storage formats for object stores like S3.

Note that the readers of Parquet need to be aware of any metadata to exploit it. But if not, nothing changes - as long as we're embedding only supplementary information like indices or bloom filters, a reader can still continue working with the columnar data in Parquet as it used to; it is just that it won't be able to take advantage of the additional metadata.
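The layout described above can be illustrated with a toy analogue in Python: data pages written first, an auxiliary index appended after them in a region plain readers never visit, and a footer whose key/value metadata records the index's offset. This is a simplified stand-in for the real Parquet/Thrift encoding (the JSON footer, field names, and `my_index_*` keys are all invented for illustration), meant only to show why unaware readers are unaffected.

```python
import io
import json
import struct

def write_file(pages: list, index_bytes: bytes) -> bytes:
    """Write data pages, an embedded auxiliary index, and a footer that
    records page offsets plus key/value metadata pointing at the index."""
    buf = io.BytesIO()
    offsets = []
    for page in pages:
        offsets.append(buf.tell())
        buf.write(page)
    index_offset = buf.tell()
    buf.write(index_bytes)  # extra bytes between pages and footer
    footer = json.dumps({
        "page_offsets": offsets,
        "page_sizes": [len(p) for p in pages],
        "key_value_metadata": {"my_index_offset": index_offset,
                               "my_index_len": len(index_bytes)},
    }).encode()
    buf.write(footer)
    # Like Parquet, end with the footer length so readers can find it.
    buf.write(struct.pack("<I", len(footer)))
    return buf.getvalue()

def read_pages(data: bytes) -> list:
    """An 'unaware' reader: follows only the page offsets in the footer,
    so the embedded index is silently skipped."""
    (flen,) = struct.unpack("<I", data[-4:])
    footer = json.loads(data[-4 - flen:-4])
    return [data[o:o + s]
            for o, s in zip(footer["page_offsets"], footer["page_sizes"])]

def read_index(data: bytes) -> bytes:
    """An 'aware' reader: additionally consults the key/value metadata
    to locate the embedded auxiliary index."""
    (flen,) = struct.unpack("<I", data[-4:])
    meta = json.loads(data[-4 - flen:-4])["key_value_metadata"]
    o, n = meta["my_index_offset"], meta["my_index_len"]
    return data[o:o + n]

blob = write_file([b"page0", b"page1"], b"distinct-index")
assert read_pages(blob) == [b"page0", b"page1"]  # unaware readers unaffected
assert read_index(blob) == b"distinct-index"     # aware readers find the index
```

Because both readers resolve everything through footer offsets, the index bytes cost only file size for readers that don't know about them, which is exactly the trade-off discussed below.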


> Note that the readers of Parquet need to be aware of any metadata to exploit it. But if not, nothing changes

The one downside of this approach, which is likely obvious but which I haven't seen mentioned, is that the resulting Parquet files are larger than they would be otherwise, and the increased size only benefits engines that know how to interpret the new index.

(I am an author)


So, can we take that as a "no"?


There is no spec. Personally I hope that the existing indexes (bloom filters, zone maps) get re-designed to fit into a paradigm where Parquet itself has more first-class support for multiple levels of indexes embedded in the file, along with conventions for how those common index types are represented. That is, start with the Wild West and define specs as needed.


> That is, start with Wild West and define specs as needed

Yes this is my personal hope as well -- if there are new index types that are widespread, they can be incorporated formally into the spec

However, changing the spec is a non-trivial process and requires significant consensus and engineering.

Thus the methods in the blog post can be used to ship indexes prior to any spec change, and potentially as a way to prototype / prove out new potential indexes.

(note I am an author)


The story here isn't that they've invented a new format for user-defined indexes (the one proposed here is somewhat contrived, and I probably wouldn't recommend it in production), but rather a demonstration of how the user-defined metadata space of the Parquet format can be used for application-specific purposes.

I work on a database engine that uses Parquet as our on-disk file format, and we make liberal use of the custom metadata area for things specific to our product that any other Parquet reader would just ignore.


The Arrow/Parquet community is already discussing standardization via the Parquet format GitHub - this approach intentionally uses existing extension points in the format specification to remain compatible while the standardization discussions progress.

