
Ninjaneered · on Sept 23, 2020

Good question. I haven't watched the Limiting Factor video posted above yet, but I did find this [1] on Ars Technica:

> For example, in a cylindrical battery, the cathode and anode are wound tightly around one another. In a conventional battery, a "tab" sticks out of each side of this roll—one connecting the coiled cathode sheet to one end of the cell, the other connecting the anode to the opposite end. Tesla says it has pioneered a new "tabless" internal structure that is not only less prone to overheating, but is also easier to manufacture. Eliminating the tabs means there's less need to start and stop the manufacturing process to make sure the tabs are properly positioned in each coil.

[1] https://arstechnica.com/cars/2020/09/how-tesla-plans-to-make...

Ninjaneered · on Sept 2, 2020

If you're willing to install extensions, MediaWiki has some options that you might be interested in. It has a visual editor in addition to markup; my preferred configuration remembers which editor each user used last, but lets them switch whenever they want. Out of the box, MediaWiki is fairly bare-bones, but if you're willing to spend some time on setup, it's pretty powerful. For example, page views were removed in MW 1.25, but can be added back using the HitCounters[1] or Wiretap[2] extensions. However, the extension I think you might like is WatchAnalytics[3]. It can show how "watched" a wiki's pages are, who's watching them, how quickly they're reviewed after being edited, and more. It's really helpful for building consensus at a company about what's known and for providing the metrics to encourage review.

[1] https://www.mediawiki.org/wiki/Extension:HitCounters

[2] https://www.mediawiki.org/wiki/Extension:Wiretap

[3] https://www.mediawiki.org/wiki/Extension:WatchAnalytics

fuball63 · on Sept 2, 2020

Thanks for the tip!

Ninjaneered · on Sept 2, 2020

One more tip: you might try using Meza[1], an automated MediaWiki install/deployment tool, to get up and running quickly. It comes pre-baked with several extensions[2], including visual editing and the extensions I mentioned above (Wiretap & WatchAnalytics). It was actually created by a talented team at NASA and makes things much easier (including suggested extensions for enterprise use) if you have access to a server.

[1] https://www.mediawiki.org/wiki/Meza

[2] https://www.mediawiki.org/wiki/Meza/Extensions_installed_by_...

Ninjaneered · on July 3, 2020

Can you explain why this is preferred? Is it to ensure everyone knows where it comes from (journal or preprint)?

JadeNB · on July 3, 2020

For me, at least two reasons:

1. As you say, to see more information about where the paper comes from.

2. It's easy to get from the abstract page to the PDF, but not vice versa.

Personally, I also think it's good for people to get into the habit of linking to a text description of data-heavy resources rather than directly to the resources. PDFs aren't that data-heavy, but there are plenty of other things that are that could do with a text landing page, and I think it's good to get in that habit.

Ninjaneered · on July 3, 2020

Makes sense, thanks!

Ninjaneered · on July 2, 2020

For reference, this is from the same developer [1] that created Semantic MediaWiki [2] and led the development of Wikidata [3]. Here's a link to the white paper [4] describing Abstract Wikipedia (and Wikilambda). Considering the success of Wikidata, I'm hopeful this effort succeeds, but it is pretty ambitious.

[1] https://meta.wikimedia.org/wiki/User:Denny

[2] https://en.wikipedia.org/wiki/Semantic_MediaWiki

[3] https://en.wikipedia.org/wiki/Wikidata

[4] https://arxiv.org/abs/2004.04733

xiler · on July 2, 2020

He also works on Google's Knowledge Graph

https://research.google/people/vrandecic/

https://storage.googleapis.com/pub-tools-public-publication-...

O_H_E · on July 2, 2020

Damn. Big kudos to Denny.

And to all the other people doing awesome work but not on the top of HN.

gcbw3 · on July 3, 2020

Considering the close relationship with Google and Wikimedia https://en.wikipedia.org/wiki/Google_and_Wikipedia and the considerable money Google gives them, how can one not see this project as "crowdsourcing better training data-sets for Google?"

Can the data be licensed as GPL-3 or similar?

nl · on July 3, 2020

That's an incredibly zero-sum way of looking at the world.

Almost every research group and company doing NLP work uses Wikipedia. I'd say it is a fantastic donation by Google which improves science generally.

> Can the data be licensed as GPL-3 or similar?

It's under CC BY-SA and (with a few exceptions) the GNU Free Documentation License.

bawolff · on July 3, 2020

I don't think the relationship is that close. All it says is that Google donated a chunk of money in 2010 and in 2019. It was a large chunk of money (~3% of donations), but not so much as to create a dependency.

> Can the data be licensed as GPL-3 or similar?

Pretty unlikely tbh. I don't know if anything is decided for licensing, but if it is to be a "copyleft" license it would be CC BY-SA (like Wikipedia), since this is not a program.

Keep in mind that in the United States, an abstract list of facts cannot be copyrighted afaik (I don't think this qualifies as that; Wikidata might, though).

zozbot234 · on July 3, 2020

How so? Wikimedia-provided data can be used by anyone. Google could have kept using and building on their Freebase dataset had they wanted to - other actors in the industry don't have it nearly as easy.

antonii · on July 3, 2020

Denny seems to be leaving Google and joining Wikimedia Foundation to lead the project this month, so probably you do not need to worry too much about Denny's affiliation with Google.

9nGQluzmnq3M · on July 3, 2020

As a long-time Wikipedian, this track record is actually worrisome.

Semantic Mediawiki (which I attempted to use at one point) is difficult to work with and far too complicated and abstract for the average Wiki editor. (See also Tim Berners-Lee and the failure of Semantic Web.)

WikiData is a seemingly genius concept -- turn all those boxes of data into a queryable database! -- kneecapped by academic but impractical technology choices (RDF/SPARQL). If they had just dumped the data into a relational database queryable by SQL, it would be far more accessible to developers and data scientists.

mmarx · on July 3, 2020

> WikiData is a seemingly genius concept -- turn all those boxes of data into a queryable database! -- kneecapped by academic but impractical technology choices (RDF/SPARQL). If they had just dumped the data into a relational database queryable by SQL, it would be far more accessible to developers and data scientists.

Note that the internal data format used by Wikidata is _not_ RDF triples [0], and it's also highly non-relational, since every statement can be annotated by a set of property-value pairs; the full data set is available as a JSON dump. The RDF export [1] (there are actually two; I'm referring to the full dump here) maps this to RDF by reifying statements as RDF nodes. If you wanted to end up with something queryable by SQL, you would also need to resort to reification, but then SPARQL is still the better choice of query language, since it allows you to easily do path queries, whereas WITH RECURSIVE at the very least makes your SQL queries quite clunky.

[0] https://www.mediawiki.org/wiki/Wikibase/DataModel

[1] https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Fo...
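
To give a feel for the WITH RECURSIVE point, here is a minimal sketch using Python's built-in sqlite3 module. The `parent_of` table, the names in it, and the `ancestors` helper are all made up for illustration and have nothing to do with the real Wikidata schema. A transitive "all ancestors" lookup, which in SPARQL would be a one-line property path such as `:parent+` (a hypothetical predicate), needs a recursive CTE in SQL:

```python
import sqlite3

# Hypothetical toy data: (child, parent) edges, not real Wikidata content.
EDGES = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("erin", "frank"),
]


def ancestors(person):
    """Return all transitive parents of `person` via a recursive CTE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parent_of (child TEXT, parent TEXT)")
    conn.executemany("INSERT INTO parent_of VALUES (?, ?)", EDGES)
    rows = conn.execute(
        """
        WITH RECURSIVE anc(name) AS (
            -- base case: direct parents
            SELECT parent FROM parent_of WHERE child = ?
            UNION
            -- recursive step: parents of anything found so far
            SELECT p.parent FROM parent_of AS p JOIN anc ON p.child = anc.name
        )
        SELECT name FROM anc ORDER BY name
        """,
        (person,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(ancestors("alice"))
```

It works, but the query is already a dozen lines for a single edge type, and that is before reified statements and qualifiers enter the picture.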

boxed · on July 3, 2020

The SPARQL API is no fun. The 60s limit, for example, is death. I had to resort to getting the full dump.

zozbot234 · on July 3, 2020

How do you dump general purpose, encyclopedic data into a relational database? What database schema would you use? The whole point of "triples" as a data format is that they're extremely general and extensible.

9nGQluzmnq3M · on July 3, 2020

Most structured data in Wikipedia articles is in either infoboxes or tables, which can easily be represented as tabular data.

  Table country:

  Name,Capital,Population
  Aland,Foo,100
  Bland,Bar,200

Now you need a graph for representing connections between pages, but as long as the format is consistent (as they are in templates/infoboxes) that can be done with foreign keys.

  Table capital
  ID,Name
  123,Foo
  456,Bar

  Table country
  Name,Capital_id,Population
  Aland,123,100
  Bland,456,200
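
As a rough sketch of what that would look like in practice, here are the two tables above in SQLite via Python's sqlite3 module. The column types, the constraints, and the `build_db` / `capitals` helper names are my own assumptions; only the sample rows come from the tables above.

```python
import sqlite3


def build_db():
    """Create the two example tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE capital (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """
        CREATE TABLE country (
            name TEXT PRIMARY KEY,
            capital_id INTEGER NOT NULL REFERENCES capital(id),
            population INTEGER NOT NULL
        )
        """
    )
    conn.executemany(
        "INSERT INTO capital VALUES (?, ?)", [(123, "Foo"), (456, "Bar")]
    )
    conn.executemany(
        "INSERT INTO country VALUES (?, ?, ?)",
        [("Aland", 123, 100), ("Bland", 456, 200)],
    )
    return conn


def capitals(conn):
    """Follow the foreign key from each country to its capital."""
    return conn.execute(
        """
        SELECT country.name, capital.name, country.population
        FROM country
        JOIN capital ON capital.id = country.capital_id
        ORDER BY country.name
        """
    ).fetchall()


for row in capitals(build_db()):
    print(row)
```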

mmarx · on July 3, 2020

> Most structured data in Wikipedia articles is in either infoboxes or tables

Most of the data in Wikidata does not end up in either infoboxes or tables in some Wikipedia, however, and graph-like data such as family trees, for example, works quite poorly in a relational database, even if you don't consider qualifiers at all.

zozbot234 · on July 3, 2020

Those infoboxes get edited all the time to add new data, change data formats, etc. With a relational db, every single such edit would be a schema change. And you would have to somehow keep old schemas around for the wiki history. A triple-based format is a lot more general than that.
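
A tiny sketch of what I mean, in plain Python. The property names and values are invented for illustration, not real Wikidata identifiers, and `describe` is just a hypothetical helper:

```python
# Hypothetical toy triples (subject, property, value).
triples = [
    ("Aland", "capital", "Foo"),
    ("Aland", "population", 100),
    ("Bland", "capital", "Bar"),
    ("Bland", "population", 200),
]


def describe(subject, store):
    """Collect every property/value pair recorded for one subject."""
    result = {}
    for s, prop, value in store:
        if s == subject:
            result.setdefault(prop, []).append(value)
    return result


# Adding a brand-new kind of fact is just one more row. In the relational
# layout this would be an ALTER TABLE (a schema change) instead.
triples.append(("Aland", "anthem", "Ode to Foo"))

print(describe("Aland", triples))
print(describe("Bland", triples))
```

Subjects that never got the new property are simply unaffected, and the old rows stay valid as they are, which is what makes the history problem go away.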

LukeEF · on July 3, 2020

RDF shouldn't be lumped in with SPARQL

tasogare · on July 3, 2020

That’s the same set of technology. SPARQL is used to query RDF graphs, so the two are pretty tightly coupled.

Ninjaneered · on Feb 6, 2020

The Enterprise MediaWiki Conferences are a great way to learn best practices about Knowledge Management within organizations. If you have a wiki within your company/organization or think you should, you'll meet a lot of passionate developers, administrators, and users. This one has the added bonus of getting a tour of NASA as well.

Ninjaneered · on Dec 18, 2019

Here's the link to the draft:

https://en.wikipedia.org/wiki/Draft:Apache_Arrow

And some possible additional sources:

* https://www.forbes.com/sites/forbestechcouncil/2019/09/24/dr...

* https://www.businesswire.com/news/home/20180906005114/en

* https://thesiliconreview.com/2016/02/apache-arrow-is-the-new...

tptacek · on Dec 18, 2019

The first article is a paid promotion piece, which WP won't accept as an RS.

The second is a press release by Arrow's sponsoring company, which, obviously, WP won't accept as an RS.

I have no idea what "The Silicon Review" is; this is the first time I've ever seen it. To the extent it's not a pay-to-play trade publication, it might qualify as a notability-establishing source. The fact that the "Review" does not itself have a WP page might make it harder to claim it's reliable, since it suggests nobody else knows what it is, either.

Ninjaneered · on Dec 19, 2019

Looks like my lateral reading was sub-par (actually I didn't even try, just a quick Google/post).

The "Silicon Review" one looks like pay-to-play as well after further review. It's used as a citation in a few other Wikipedia articles, but as far as I can tell, and based on some anecdotal stories, it doesn't look good.

* https://www.reddit.com/r/PublicRelations/comments/bha6hs/sil...

* https://arpr.com/blog/4-pay-for-play-scams/

Good catch, thanks for spending the time to review my links. Reading your comments above, I largely agree. It's a high bar (mostly) to get an article on Wikipedia, and that's a good thing. It allows us to read the majority of content on Wikipedia without too much suspicion.

SquishyPanda23 · on Dec 18, 2019

I read the draft.

Maybe this is an unpopular opinion, but it's obvious advertising and has no place on Wikipedia. Maybe a Medium post would be more appropriate.

Wikipedia already has a problem with bad software articles like this.

JohnFen · on Dec 18, 2019

I mostly agree. It is distinctly marketing-flavored, although not to a degree that I think should disqualify it alone.

What I think should disqualify it is that it's missing a lot of detail that would make the entry genuinely useful. As it is, it's as useful as a press release. Also, it does appear to have a problem with appropriate references.

Generally speaking, I have a hard time disagreeing with the reasons listed on that page for the rejections.

Ninjaneered · on Dec 3, 2019

Great idea and nice website!

Seems like this could integrate well with an enterprise wiki (an attempt to document what is in the employees' heads).

Ninjaneered · on May 16, 2019

> despite nobody wanting to live besides a nuclear reactor

I certainly wouldn't say this is true. I live 13 miles from Diablo Canyon Power Plant. Not only is its proximity a non-issue for the vast majority of us in San Luis Obispo County, the economic vitality it injects into our economy (high-paying jobs, taxes, etc.) helps make our area very desirable to live in. For reference, Diablo gives us $22 million in local property taxes annually, of which about $8 million per year is allocated to the school district. It's going to be tough when it shuts down in 2024/2025.

Ninjaneered · on April 26, 2019

> Decentralized, anonymized reputation management. Imagine an ebay score that couldn't be owned by ebay, or any another company

This reputation system is one application I'd really like to see in the future. It's silly that we have separate silos of reputation (eBay, Amazon, credit scores, etc.) that are each controlled by a company. In our increasingly global world, who would we trust to govern this system? Seems like a legitimate application for a blockchain (including consensus).

Ninjaneered · on March 11, 2019

So, I didn't really pay attention to the article, but wow, that spinning mask[0] is interesting!

Somehow, the effect doesn't work when looking at only the forehead[1], which I would have assumed is because my brain doesn't recognize it as a "face". However, the effect still works for me when looking only at the neck[2], which I can't explain other than as something to do with the shading?

Cool stuff!

[0] https://i1.wp.com/slatestarcodex.com/blog_images/spinning_ma...

[1] https://www.dropbox.com/s/io9nsrpp52waxq5/spinning_mask_uppe...

[2] https://www.dropbox.com/s/te0s3wgf6rilf4a/spinning_mask_lowe...

alephr · on March 11, 2019

The surface normals don't change when the mask rotates from the front to the back, so the lighting appears to "change direction". I think looking at the edges gives you more cues to see that the mask is rotating, versus having only the reversing lighting cues in the center.
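
One way to read the lighting point, as a rough sketch only: with simple diffuse (Lambertian) shading, a hollow surface lit from one side is shaded exactly like the matching bulging surface lit from the mirrored side, so shading alone can't tell the two apart. The slope and light values below are arbitrary numbers I picked, and `lambert` / `normal_of` are made-up helper names.

```python
import math


def lambert(normal, light):
    """Diffuse brightness: clamped cosine between normal and light direction."""
    def unit(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    n, l = unit(normal), unit(light)
    return max(0.0, sum(a * b for a, b in zip(n, l)))


def normal_of(slope_x, slope_y, convex=True):
    """Viewer-facing normal of a height field z = +f(x, y) or z = -f(x, y)."""
    sign = 1.0 if convex else -1.0
    return (-sign * slope_x, -sign * slope_y, 1.0)


slopes = (0.6, -0.3)            # arbitrary local surface slope
light = (1.0, 0.5, 2.0)         # arbitrary light direction
mirrored = (-1.0, -0.5, 2.0)    # same light, flipped left/right and up/down

bulging = lambert(normal_of(*slopes, convex=True), light)
hollow_mirrored = lambert(normal_of(*slopes, convex=False), mirrored)
hollow_same = lambert(normal_of(*slopes, convex=False), light)

print(bulging, hollow_mirrored, hollow_same)
```

The first two values come out identical while the third differs, which fits the idea that the brain keeps the "bulging face" reading and explains the shading by moving the light instead.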