Ochethi Shakowin Data Sovereignty — Lakota Language Reclamation Project

When Covid-19 hit Lakxota lands in the spring (I will be using the Txakini-Iya Wowapi created by First Language Speaker, Teacher, and Linguist Txunwin Violet Catches for any Lakxota words in this writing), many of us -- those who had the privilege to do so, anyway -- stayed home and did our best to follow the CDC protocols and guidelines. We spent hours at home trying to keep ourselves busy and sane by finishing those home projects we put off years ago, experimenting with cooking, FaceTiming and annoying our relatives, among any number of other endeavors.

Those tasks, large and small, happened because we found ourselves in new territory: we had time on our hands. With this newfound time, I thought -- after a couple of weeks of being absolutely lazy -- "all right, yes, I can finally work on my language," even though I had the time before.

This time, though, I stuck to it and started sorting through old texts, transcriptions, transliterations, Lakxota grammar, stories, and whatever else I could get my hands on. I spent hours in front of my computer with different books and resources.

Sorting through the Grammar book put out by The Lakota Language Consortium, which was founded by a Czech person and an Austrian person, I came across something interesting inside the opening cover of one of their books: “All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher.” The thought came to me: Who has the right to copyright things in our language?

Unci Delores (my grandmother) was used as a primary source in many of their resources, including their grammar book. Because I am Lakxota, these are not only Unci's words and the words of our ancestors; they are also mine, my children's, and my unborn grandchildren's. They do not belong to the Lakota Language Consortium.

In the copyright declaration, it says that none of it should be shared “without permission in writing from the publisher.” Why should I have to ask an outside entity if I can use my own language, especially words and knowledge from my own Grandmother? Whose permission did they get to do this? Which Tribal entities did they get permission from? How much money was made off these words that do not belong to these outside groups like the Lakota Language Consortium?

*****

These questions surrounding copyright led me to interesting places, interesting people, and interesting discussions, even if I, unfortunately, upset a lot of people speaking about these things. And those places, those people, those discussions brought me to something called Indigenous Data Sovereignty.

What is Indigenous Data Sovereignty? According to Indigenous Data Sovereignty and Governance at the University of Arizona, it “is the right of a nation to govern the collection, ownership, and application of its own data. It derives from tribes’ inherent right to govern their peoples, lands, resources.”

But why Data? Why is this a big deal? We never even wrote our language down, so why is this important?

With many of our speakers leaving us, our language is leaving this world, too. Libraries are leaving us. This virus is devastating to our communities. Many of those who grew up with our language as their first language, these amazing people to whom no word in this writing can do justice, are leaving us.

What is to be when many of them are gone? In the coming days, we will depend on what they have taught us, the language they taught us, what they have put in our hearts and our minds, the memories, the feelings. But we also will depend on data: recordings, translations, dictionaries, stories, writing, books, and all of the other pieces that fall under the data umbrella.

*****

I was reassured that the copyright statement I read was “standard” and that the LLC would just be happy that people are using their resources because they were made “for the good of the language.” I just said, “we’ll see.”

In the first few days of the new year, I made a Language lesson in the Language Learning App Memrise. I used lesson 49 of the Lakota Language Consortium’s Grammar book, because those are our resources, our intellectual property, which they organized under copyright without getting permission to do so. I posted this lesson in our Language Learning group on Facebook. On January 6, 2021, I received an email from Memrise that said, “We’ve received a copyright infringement notice from the legal owner of the content posted in your course. As explained in our Terms of Use, copyrighted material you don’t have the necessary permission for are not allowed to be posted on Memrise.” I would share the link I was originally provided, but that link leads to a 404 Error Message now.

Again, the Lakota Language Consortium has the copyright. They are, according to Memrise, “the legal owner of the content,” so they wrote to Memrise and claimed ownership of this lesson.

Memrise deleted the lesson.

I think it’s odd that a Czech person who plays Indian, and a different man from Austria, have more claim over our language than me, a Lakxota/Dakhota, a Lakxota Language teacher, learner, and parent. The Czech man can be seen in a documentary on Vimeo, If Only I Were an Indian, also called Becoming a Native American in the Czech Republic (1995), calling himself Crazy Buffalo.

We are Hunkpapxa. My family comes from the Txahanp Shica Thiyoshpaye (wrongly called the Cheoxba by the history books). My Lala Shunka Wanbli fought in the Greasy Grass, went to Canada with Sitting Bull, and settled on Standing Rock. My Unci, Iteskala Win, sometimes Itesan Win, was a survivor of Wounded Knee. Way back in the early 1800s, our Grandfather Matxo Ite even shows up in documents fighting against encroachment. Matxo Ite had six sons: Iron Horn (my grandfather), Red Thunder, Little Bear, Bear Face, Shave Head, and Rain-In-The-Face. Iron Horn was the father of the first Taken Alive, and this is where our line comes from. We can track the Bear Face-Iron Horn-Taken Alive bloodline through 11 generations.

Not only do they feel they have more claim over our language than me as a Lakxota person, but they believe they have more claim than even my own family. My unci recorded many things for them. After she passed, they shared a recording of her on their YouTube page, LakotaLanguageConsortium. I thanked them for transcribing it and recording it but said please ask my father before sharing. This is the respectful thing to do, as my dad is currently the administrator of my Grandmother’s estate.

In a now deleted response, they said, “We have permission from Delores herself.” They claim they have control over her image and voice, even after her death, because they use Western notions of copyright and ownership and legal threats to continue to control what belongs to the Lakxota/Dakhota people to steward and care for. Instead of being stewards of the language, like so many assumed the LLC would be, their response was callous, disrespectful, and in direct opposition to what Lakxota/Dakhota people would have done or said after a recent loss.

*****

I truly believe we will have first language speakers again soon because of all of the many efforts and hours being put in by language reclaimers throughout the Ochethi Shakowin. To bridge the generations of our elders and our children, however, data will be a huge part of our language Revitalization and Reclamation. I contend that when our elders leave us, whoever controls the data will control our movement. I simply prefer that the reins be in the hands of the Lakxota/Dakhota people and not any outside entities like the LLC.

Right now in Lakxota country, there is a big push to record our elders so we can preserve their language, teachings, and everything they are willing to share, so we can someday hope to pass these things on to the next generation. Make no mistake, this is important; the question of who will own the data, however, is just as important, as is how this data will be collected.

In college, many of us have found ourselves in the odd predicament of looking up research papers on our people in databases, only to find those things blocked by a paywall. If we want to get past the paywall, we have to pay whatever price the publishers set to access that database or that article. I cannot imagine our ancestors who sat down with anthros, researchers, or linguists would have agreed if they knew these things were going to be gatekept and sold back to their grandchildren.

Whether we want to admit it or not, data can have huge financial implications. Although most do not view our language and teachings through the lens of capitalism, others do and are exploiting it.

This leads back to the Lakota Language Consortium and how they collect their data. I recently received the “waiver” they use, entitled “Lakota Language Consortium Sound Recording Release.” The opening paragraph states [emphasis mine]:

"I, ______________________________, (Speaker) for good and valuable consideration, the receipt of which is acknowledged, give to The Lakota Language Consortium (Recorder), its legal representatives, successors, and all persons or corporations acting with its permission, unrestricted permission to copyright and/or use, and/or publish sound and video recordings of me, and the analog or digital information pertaining to them in all current or future formats, or in which I may be included in whole or in part, distorted in form, or reproductions thereof for any other lawful purpose.”

This waiver gives the LLC permission to copyright and publish things for all current and future formats and the freedom to copyright our elders, our songs, stories, conversations, words, definitions, and sell it back to us.

This is the danger that comes from sharing our language with outside entities. These outside entities could offer to record our elders, have them sign a waiver that strips our elders of their rights, offer them $25 to $50 an hour for their time -- and for many who may be experiencing high levels of poverty, this money may be needed. Now, however, that entity owns that data forever and can sell it back to the people many times over.

*****

Their tax Form 990 for the period ending June 2019 makes clear that they are making money off selling us back our stories and our language. In “gross receipts from admissions, merchandise sold or services performed, or facilities furnished in any activity that is related to the organization’s tax-exempt purposes,” they earned $200,798 in 2014; $244,105 in 2015; $278,910 in 2016; $234,306 in 2017; and $388,328 in 2018 for a 5-year total of $1,346,447. This portion of the return describes the money they earned after selling admission to their classes, their merchandise and books, and their services in “sharing” the language.

In the last 5 years they have made over $1.3 million selling our language and selling it as a service. According to their website, they started in 2004, so it is likely that there has been a lot more money than that over the years.

Also according to that same tax return, Lakota Language Consortium founder Jan Ullrich, from the Czech Republic, earned $100,367 in 2018 from the “[e]stimated amount of other compensation from the organization and related organizations.” Wilhelm Meya, from Austria, pulled in $96,314 from “Reportable Compensation from the organization,” and another $18,728 from “Estimated amount of other compensation from the organization and related organizations,” for over $115,000 in 2018.

That is a lot of money, especially in comparison to the honorarium that elders, the true authorities and experts of our language, might receive for trying to help by recording a narrative. It’s also a lot of money in comparison to the per capita income in counties on the Rezzes. In Corson County, the county where I live on the South Dakota side of Standing Rock, the per capita income is $16,449, according to the United States Census Bureau.

*****

On May 6th, 2017, The Economist (labeled as “neutral” in terms of bias and as “most reliable” by the Media watchdog adfontesmedia.com) spoke about the profitability of data in an article titled “The world’s most valuable resource is no longer oil but data,” saying data is “the oil of the digital era.” The article goes on to say that the 5 most valuable listed firms in the world are Alphabet (Google’s parent company), Amazon, Apple, Facebook, and Microsoft.

For many, it seems ludicrous that Big Tech would be coming after Lakxota and Indigenous data. The reality is that they already have.

With an iPhone, you can use Lakxota Language settings, the Lakxota months (or, at least, someone's interpretation of them because they are all different), and the Lakxota days of the week using the New Lakota Dictionary orthography popularized by the Lakota Language Consortium. Who gave Apple the Lakxota/Dakhota people's tech and data? Was it the Lakxota/Dakhota people? Was it a Tribal entity? What arrangement was made? Who profits from it? Who owns this data? Is this now theirs forever to do with what they wish? Also, if there are opportunities for our data in tech, shouldn’t those opportunities first be offered to a Lakxota/Dakhota person?

Similarly, a very cool app called Stellarium Plus allows you to hold your phone up to the stars, move your phone around the skies, and chart constellations, including Lakxota/Dakhota constellations. Although I enjoy this app, I still wonder, where did they get the data to do this? Whose permission did they get? What was the arrangement? As of this publication, however, they have removed the Lakxota/Dakhota feature.

As I thought about these questions about data, oftentimes out loud and on various social media sites, a friend of mine sent me to a website led by the Māori’s data department. The website is full of valuable resources, including a video called "Te Reo Māori Speech Recognition: A story of community, trust, and sovereignty."

In the video, presented at the 2020 Natives in Tech Conference, Hawaiian Language Reclaimer and Māori ally Keoni Mahelona asks the question, "Who owns the data?" Mahelona goes on to present the idea that the question of who "owns" the data is the wrong question to ask because it exists in two different outlooks: Western ideas of ownership (seen in copyright) and Indigenous ways of guardianship. He argues that we should look after data the way we looked after the land: guard it and protect it for the right causes that are culturally appropriate. The People should be the ones in control of it, not any outside entities.

There is no question about the goal of our language revitalization: to restart the intergenerational transmission of our language and have first language speakers again. Mahelona says that if we want our languages to be ubiquitous (present, appearing, or found everywhere) then it might be beneficial to work with these big tech platforms in order to help facilitate the transmission of our languages. Elders and communities, however, should adopt the royalty model. If a company is profiting off of our language and selling it as a service, our communities should receive royalties from that, much like artists who allow platforms to stream their music.

But above all, the solution is Indigenous Data Sovereignty: that we retain the copyrights, and therefore sovereignty, over our language. We the people should be making these decisions on our terms with our protocols -- not any outside entities, and definitely not Big Tech. True Indigenous Data Sovereignty requires that tribes themselves receive first preference, which is a form of affirmative action.

The Māori’s solution? They have adopted and operate under the Kaitiakitanga License, which is open-source but with affirmative action: the first choice and ultimate guardianship that Indigenous Data Sovereignty brings. This makes sure the data is collected using their traditional cultural protocols, and that the Māori own and control the data. The website koreromaori.io, which opens to the bold idea of “Indigenous language tools powered by machine learning,” states, “Indigenous people do not have a concept of private ownership of land and resources, that's a Western construct by which many of us are required to abide. We see ourselves as the caretakers of our environment and society. Likewise, when we gather data to improve our services, we're taking care of the data given to us, and we follow Tikanga (cultural protocols) when we need to make decisions around using data or providing access to data.”

The Māori believe that the Māori should have access to the data, tools, apps, and anything new first. It is the only way for the Māori to compete with outside entities, they say. They say that their Kaitiakitanga License operates sort of like an affirmative action open source. Affirmative action means that if there are any opportunities in tech, jobs, apps, or anything related to the data, the Māori shall have preference first. Open source means making sure all Māori, and eventually the public, have access to their data so all who are interested can collaborate, study, teach, and learn their language (while they retain the control over the data).

In order for our language to remain in our guardianship, it seems appropriate for the Lakxota/Dakhota people to take similar steps with our language; Indigenous Data Sovereignty is necessary for the Ochethi Shakowin.

*****

In our current heartbreaking and perilous state, these discussions must take place now; we cannot wait. If there are any opportunities from our language data -- careers, jobs in tech (e.g., speech recognition development, app creation, website creation), resource creation, grants -- those opportunities must be offered first and foremost to the Lakxota/Dakhota people. All the data must be returned to the Lakxota/Dakhota people, and we must assert our Indigenous ways of guardianship (not ownership) over this data.

If others wish to use our language, they must respect our communities, elders, and Tribal and Data Sovereignty by getting permission from us, the ultimate authorities, and not from colleges/universities, outside entities who organized it under copyright, outside researchers, anthros, or linguists. From us. If outside entities are to profit off of it, they must adopt the royalties model with our elders and communities, always honoring Tribal Sovereignty. Outside entities should have neither the power nor the authority to disseminate our data to other outside entities however they wish.

The data and the direction of our language revitalization and reclamation must rest in the hands of the Lakxota/Dakhota people with our elders leading, our Lakxota/Dakhota learners and reclaimers sitting shotgun, and our allies riding humbly in the backseat. Enforcing Data Sovereignty is the most powerful step we must take as Lakxota/Dakhota people to ensure that we sustain our movement into the future.

Wophila txanka echiciyapelo.

Hechetuwelo.

Chatka he miye lo.

(Ray Taken Alive)

Hunkpapxa hemacha.

Txahanp Shica na Kxangxi Ska Thiyospaye ematanhanyelo.

Lakotalanguagereclamationproject.com