
Social Media Content Fueling AI: How Platforms Are Using Your Data for Training

 

OpenAI has admitted that developing ChatGPT would not have been feasible without the use of copyrighted content to train its algorithms. It is widely known that artificial intelligence (AI) systems heavily rely on social media content for their development. In fact, AI has become an essential tool for many social media platforms.

For instance, LinkedIn is now using its users’ resumes to fine-tune its AI models, while Snapchat has indicated that if users engage with certain AI features, their content might appear in advertisements. Despite this, many users remain unaware that their social media posts and photos are being used to train AI systems.

Social Media: A Prime Resource for AI Training

AI companies aim to make their models as natural and conversational as possible, with social media serving as an ideal training ground. The content generated by users on these platforms offers an extensive and varied source of human interaction. Social media posts reflect everyday speech and provide up-to-date information on global events, which is vital for producing reliable AI systems.

However, it's important to recognize that AI companies are utilizing user-generated content for free. Your vacation pictures, birthday selfies, and personal posts are being exploited for profit. While users can opt out of certain services, the process varies across platforms, and there is no assurance that your content will be fully protected, as third parties may still have access to it.

How Social Platforms Are Using Your Data

Recently, the United States Federal Trade Commission (FTC) revealed that social media platforms are not effectively regulating how they use user data. Major platforms have been found to use personal data for AI training purposes without proper oversight.

For example, LinkedIn has stated that user content can be utilized by the platform or its partners, though they aim to redact or remove personal details from AI training data sets. Users can opt out by navigating to their "Settings and Privacy" under the "Data Privacy" section. However, opting out won’t affect data already collected.

Similarly, the platform formerly known as Twitter, now X, has been using user posts to train its chatbot, Grok. Elon Musk’s social media company has confirmed that its AI startup, xAI, leverages content from X users and their interactions with Grok to enhance the chatbot’s ability to deliver “accurate, relevant, and engaging” responses. The goal is to give the bot a more human-like sense of humor and wit.

To opt out of this, users need to visit the "Data Sharing and Personalization" tab in the "Privacy and Safety" settings. Under the “Grok” section, they can uncheck the box that permits the platform to use their data for AI purposes.

Regardless of the platform, users need to stay vigilant about how their online content may be repurposed by AI companies for training. Always review your privacy settings to ensure you’re informed and protected from unintended data usage by AI technologies.

Preserving Literary Integrity: Indian Publishers Plead for Copyright Measures Against AI Models

 


In the wake of rising AI copyright disputes around the globe, it may become necessary to amend the Information Technology rules to ensure that news publishers in India are fairly compensated for the use of their content in training generative artificial intelligence (GenAI) models.

The Digital News Publishers Association (DNPA) has written to the ministries of information and broadcasting and of electronics and information technology, requesting safeguards against copyright infringement by artificial intelligence models in the digital news space.

Now that the benefits of generative AI and its implications for content creators and publishers are better understood, Sujata Gupta, secretary general of the DNPA, is quoted in the report as saying, "There is a chance to ensure that any company or LLM (large language model) uses data fairly and transparently in conjunction with compensating the sources from which the content or data used to train the model was derived."

In recent decades, Artificial Intelligence (AI) technology has progressed rapidly, and this has had a significant impact on people's daily lives. In the past, people would search for information on Google and sift through a few results, but now they can use chatbots to receive answers to specific questions or generate content for specific searches. 

OpenAI's ChatGPT is one of the more popular AI tools that anyone can use for conversational tasks. Users can ask questions, request explanations, generate text, and engage in interactive text-based conversations on a wide range of topics.

The DNPA, which represents 17 top media publishers in the country, including Times Group, which publishes ET, is asking for the IT Rules to be amended until the Digital India Act comes into effect. That act is expected to replace the IT Act of 2000, which is more than two decades old, and to regulate artificial intelligence.

Over the past three months, the association has been raising the industry's concerns in talks with the ministries, according to Gupta. In a lawsuit filed on December 27 in the US district court in Manhattan, the New York Times alleged that millions of its articles had been used unlawfully to train chatbots from Microsoft-backed OpenAI, which now compete with the news outlet as sources of reliable information.

The New York Times has not specified the amount of monetary compensation it is seeking; however, according to the lawsuit, it holds the companies liable for huge sums in statutory and actual damages for copying and using its original and unique works without authorization.

The suit also asks that the companies be ordered to destroy any chatbot models and training data that use its copyrighted material. The newspaper noted that it had approached OpenAI in April, asking for a commercial agreement or the introduction of 'technological guardrails' in its next-generation technologies.

None of these efforts came to fruition. In a blog post on January 8, OpenAI dismissed the NYT's lawsuit as overstated and without merit, and pointed to the "transformative potential" of AI for journalism.

The term 'derivative works' refers to works derived from existing works that are protected by intellectual property rights. If they introduce sufficient variation from the original work, they may themselves be protected by intellectual property law.

A ChatGPT response is based on what the model has learned from its training data and several pre-existing sources, which makes it a form of generative artificial intelligence. Depending on the case, derivative works can be created either from works in the public domain or from works used with the explicit permission of the copyright holder.

The degree of alteration that must be introduced to the original material for a derivative work to qualify for copyright protection depends on the type of work involved. For some works, translation into another language may be sufficient, while others may demand a complete shift to another medium.

Essentially, the act of substituting a few words in a written piece proves insufficient to generate a derivative work; a substantial modification of the content becomes imperative. Furthermore, for a work to be considered derivative, it must encompass a sufficient amount of the original material, firmly rooted in its source. 

The ascendancy and widespread adoption of ChatGPT give rise to noteworthy concerns surrounding intellectual property, necessitating careful consideration. Amendments to existing copyright laws may be requisite to effectively address the distinctive challenges posed by advancements in AI technology. The legal implications associated with the use of such tools are likely to remain intricate and indeterminate until more definitive legislation is enacted.

AI Models Produce Photos of Real People and Copyrighted Images


Popular image generation models can be prompted to produce identifiable photos of real people, potentially infringing the privacy of numerous individuals, according to new research.

The study also demonstrates how these AI systems can be made to reproduce exact copies of copyrighted artwork and medical images. The result might help artists who are suing AI companies for copyright violations.

Research: Extracting Training Data from Diffusion Models 

Researchers from Google, DeepMind, UC Berkeley, ETH Zürich, and Princeton obtained their findings by repeatedly prompting Stable Diffusion and Google’s Imagen with image captions, such as a person’s name. They then analyzed whether any of the generated images matched original photos in the model's training set. The team succeeded in extracting more than 100 copies of photos from the AI's training set.
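The matching step described above can be pictured with a toy sketch. The article does not say which distance measure or threshold the researchers used, so the `rms_distance` function, the `threshold` value, and the tiny four-pixel "images" below are all illustrative assumptions, not the study's method.

```python
import math

def rms_distance(a, b):
    """Root-mean-square difference between two equal-length pixel lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def find_memorized(generated, training_set, threshold=0.05):
    """Return (generated_index, training_index) pairs that are near-duplicates.

    The threshold is a hypothetical value chosen for this toy example.
    """
    matches = []
    for g_idx, g in enumerate(generated):
        for t_idx, t in enumerate(training_set):
            if rms_distance(g, t) < threshold:
                matches.append((g_idx, t_idx))
    return matches

# Made-up four-pixel "images" standing in for real training photos.
training_set = [
    [0.0, 0.1, 0.2, 0.3],
    [0.9, 0.8, 0.7, 0.6],
]
# Made-up model outputs: one novel image, one near copy of training_set[1].
generated = [
    [0.5, 0.5, 0.5, 0.5],
    [0.9, 0.81, 0.7, 0.59],
]

matches = find_memorized(generated, training_set)
print(matches)
```

A generated image that lands within the threshold of a training image is flagged as a likely memorized copy; everything else is treated as novel.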

These image-generating AI models are trained on vast data sets of captioned images taken from the internet. The most recent technology works by taking images in the data sets and altering pixels step by step until the original image is nothing more than a jumble of random pixels. The AI model then reverses the procedure to create a new image from the noise.
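The noising half of that procedure can be sketched in a few lines. This is a minimal illustration that assumes Gaussian noise blended in with a weight called `alpha_bar` (a common formulation for diffusion models, but not something the article specifies); the learned reverse step, which is the actual model, is not shown.

```python
import math
import random

def add_noise(pixels, alpha_bar, rng):
    """Blend pixels with Gaussian noise: sqrt(alpha_bar)*x + sqrt(1-alpha_bar)*noise."""
    keep = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [keep * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]

def correlation(a, b):
    """Pearson correlation between two equal-length pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(0)
# Random values standing in for the pixels of a 64x64 image (an assumption).
image = [rng.uniform(-1.0, 1.0) for _ in range(4096)]

lightly_noised = add_noise(image, alpha_bar=0.99, rng=rng)    # early step
heavily_noised = add_noise(image, alpha_bar=0.0001, rng=rng)  # final step

print(round(correlation(image, lightly_noised), 2))
print(round(correlation(image, heavily_noised), 2))
```

With `alpha_bar` close to 1 the noised image still tracks the original closely, while with `alpha_bar` close to 0 almost nothing of the original survives, which is the "jumble of random pixels" the model learns to reverse.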

According to Ryan Webster, a Ph.D. student from the University of Caen Normandy, who has studied privacy in other image generation models but is not involved in the research, the study is the first to demonstrate that these AI models memorize photos from their training sets. This could also have implications for startups wanting to use AI models in health care, since it indicates that these systems risk leaking users’ private and sensitive data.

Eric Wallace, a Ph.D. student who was part of the study group, says the team hopes to raise the alarm about the potential privacy issues with these AI models before they are widely deployed in sensitive sectors like medicine.

“A lot of people are tempted to try to apply these types of generative approaches to sensitive data, and our work is definitely a cautionary tale that that’s probably a bad idea unless there’s some kind of extreme safeguards taken to prevent [privacy infringements],” Wallace says. 

The extent to which these AI models memorize and regurgitate photos from their training sets is also at the heart of a major conflict between AI businesses and artists. Stability AI faces two lawsuits, from Getty Images and from a group of artists, who claim the company illicitly scraped and processed their copyrighted content.

The researchers' findings could ultimately help artists argue that AI companies have violated their copyright. The companies may have to pay artists whose work was used to train Stable Diffusion if the artists can demonstrate that the model copied their work without their consent.

According to Sameer Singh, an associate professor of computer science at the University of California, Irvine, these findings hold paramount importance. “It is important for general public awareness and to initiate discussions around the security and privacy of these large models,” he adds.  

The Russian State Duma introduced a bill aimed at combating online pirates

A bill aimed at combating online piracy has been submitted to the State Duma of the Russian Federation. The document will allow copyright holders to independently enter links to sites with illegal content in a special register, after which these links will have to be removed from the search results on the Internet within six hours. Currently, this practice applies only to those companies that have signed the Anti-Piracy Memorandum.

"The fight against the spread of pirated content is extremely complex and requires the efforts of both the state, its supervisory and regulatory bodies, IT specialists, and the entire community of Internet users in general," said Andrey Trofimov, chairman of the Crimean Union of Journalists.

He added that it is necessary to fight not with ordinary users, but with distributors of pirated content, illegal file-sharing sites, and online cinemas.

Illegal online resources offering to watch any movie “for free” and “without registration” often contain malicious code.

Viruses and targeted hacker attacks have become highly sophisticated. Previously, attackers had to persuade the user to download and install something on a PC in order to break in. Now it is enough, for example, for the user to simply open an email, which triggers the launch of a program that encrypts data on the computer.

The Anti-Piracy memorandum has been in force in the country since 2018. The document was signed by the largest Russian Internet companies, including Rambler Group, Mail.ru Group and Yandex, as well as the copyright holders. According to the document, copyright holders submit links with pirated movies and TV series for consideration, and Internet sites remove them from search results. At the moment, its validity period is extended until August 1, 2021.

Recall that E Hacking News interviewed Mr. Dhruv Bagri, a lawyer and one of the founders of the new startup Digital Witnessor (https://www.digitalwitnessor.com/). He shared with us his knowledge about copyright and how to register it securely, quickly, and easily using blockchain, including from a legal point of view.


Interview with Dhruv Bagri, founder of the copyright timestamped entity Digital Witnessor

The world is changing, and technology is changing with it. We interviewed Mr. Dhruv Bagri, a lawyer and one of the founders of the new startup Digital Witnessor (https://www.digitalwitnessor.com/). He shared with us his knowledge about copyright and how to register it securely, quickly, and easily using blockchain, including from a legal point of view.

If you have created your own software, designed clothes, choreographed a dance, or written a poem and do not know how to register the copyright in your creation or how to protect your rights, then this article is for you.


  • Please introduce yourself to our readers.

My name is Dhruv Bagri, and I am a lawyer at RDB Associates. We frequently work on matters relating to intellectual property protection, including a lot of copyright infringement work. I’m also one of the founders of the platform Digital Witnessor.


  • How would you describe Digital Witnessor?

We have developed a platform called Digital Witnessor that creates timestamps using blockchain on your works. This allows you to protect your intellectual property rights in just a few seconds. The timestamp is considered official proof of ownership, and this saves you a considerable amount of time and legal fees in case of infringements and helps in more than one way. As the Company is based out of Estonia and the Service provided has been structured, studied, and developed by industry veterans from Cyber Security Privacy Foundation PTE Ltd, a Singapore based cyber security company, it boasts of maintaining high levels of privacy in accordance with GDPR guidelines and also provides high levels of security protection to any and all content passing through the Platform.


  • Why is copyright so important?

A copyright is a right in rem, which means that the right vests in the person who created the work from the moment the work was created. The platform has been created at a time when there is a lot of uncertainty in the law with regard to copyright. Music and art and their associated businesses have been booming over the last decade, and all of these are copyrightable works. So, the copyright timestamped entity that is Digital Witnessor helps protect individuals and companies against copyright theft.


  • Are companies secure from their own programmers/employees and third parties?

Typically, the company would be the copyright holder, even though an employee might create the work on behalf of the company. That is usually the structure that is in place and is an industry standard. However, there are times when the company would not be holding the copyright. In that case, the company needs to register the copyright with the country-specific entities or registrars in its jurisdiction, which creates a legally binding registration that can be enforced in a court of law. Without that, litigation becomes a big hassle, because it is harder to prove that the work is originally theirs. Digital Witnessor takes away this problem for the company. We generate a timestamp for the company data that needs copyright protection using blockchain technology. In fact, it is just a hash that is created and used to stamp your creation. The main file does not need to be uploaded: it can be stamped without giving us access to its contents, which matters when sensitive or confidential information makes the holder of the work hesitant to hand such content over to us.
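The idea of stamping a file without uploading it can be sketched as follows. This is a minimal illustration, not Digital Witnessor's implementation: the interview does not name the hash algorithm, so SHA-256 is an assumption, and `fingerprint` and the demo file are hypothetical.

```python
import hashlib
import os
import tempfile

def fingerprint(path, chunk_size=65536):
    """Compute a SHA-256 digest of a file locally, reading it in chunks.

    SHA-256 is an assumption; the interview does not name the algorithm.
    """
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

# Demo with a throwaway file standing in for a confidential design document.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"shoe design, revision 1")
    demo_path = tmp.name

digest = fingerprint(demo_path)
os.remove(demo_path)

# Only this short hex string would need to leave the author's machine.
print(digest)
```

Because the digest is computed on the author's own machine, only the fixed-length hash needs to be submitted for stamping, and the file's contents cannot be read back from it.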


  • How can a timestamp be useful in court? Is it legal?

From a legal point of view, a proceeding that includes a hash-signed block is an electronic document that can serve as written evidence in court.

It would also be helpful if you apply for copyright registration some time after the work was created, for example, because the company is being sold and the buyer requires such IP rights to exist. Similarly, when a company is receiving investment, investors will always look more favorably on companies holding IP rights, as these are deemed intangible assets in the company books. So, a timestamp would help the registration authorities to access the document itself and to determine the exact time at which the work was created. That makes things simpler. Secondly, a timestamp would be binding in a court of law. Blockchain has been implemented in quite a few countries across the world. So, it would definitely be helpful in most countries around the world.

A timestamp plays the role of a virtual notary and is much more credible than a traditional one, because nobody can alter the information on the blockchain, not even the Company, and I think that is the beauty of this Product.
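Why altering a stamped record is detectable can be shown with a toy hash chain. This is only a sketch of the hash-linking idea under stated assumptions: the block fields, the helper names, and the dates are hypothetical, and a real public blockchain adds consensus among many independent participants, which is not modeled here.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(block):
    """Hash the block's contents, including the previous block's hash."""
    payload = json.dumps(
        {key: block[key] for key in ("index", "file_hash", "timestamp", "prev_hash")},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_block(chain, file_hash, timestamp):
    """Add a record that is linked to the hash of the block before it."""
    block = {
        "index": len(chain),
        "file_hash": file_hash,
        "timestamp": timestamp,
        "prev_hash": chain[-1]["hash"] if chain else GENESIS,
    }
    block["hash"] = block_hash(block)
    chain.append(block)

def is_valid(chain):
    """Recompute every hash and link; any edited block breaks the check."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else GENESIS
        if block["prev_hash"] != expected_prev or block["hash"] != block_hash(block):
            return False
    return True

chain = []
# Illustrative records; the contents and dates are made up.
append_block(chain, hashlib.sha256(b"poem").hexdigest(), "2020-01-01T00:00:00Z")
append_block(chain, hashlib.sha256(b"shoe design").hexdigest(), "2020-01-02T00:00:00Z")

valid_before = is_valid(chain)
chain[0]["timestamp"] = "2019-01-01T00:00:00Z"  # attempt to backdate a record
valid_after = is_valid(chain)

print(valid_before, valid_after)
```

Backdating the first record changes its contents without changing its stored hash, so the validation fails; on a public chain, other participants holding their own copies would reject the altered version.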


  • What kind of blockchain - private/public are you using? Why?

We are using a public blockchain. Firstly, in a public blockchain, anyone can take part by verifying and adding data to the blockchain. Secondly, a public network is more secure due to decentralization and active participation. Thirdly, a private blockchain is more prone to hacks, risks, and data breaches or manipulation. In a private blockchain, anyone who is overseeing the network can alter or modify any transactions according to their needs.


  • How does it work? For example, I am a designer and I want to copyright a shoe model. What should I do and how will it happen?

As I mentioned earlier, it can be uploaded on the platform. It is not necessary that the design in itself be uploaded onto the platform.

After that, the platform timestamps the uploaded file, which in this case contains a shoe design. Once the file is timestamped and the credentials of the author are stamped, it enters the blockchain.


It should be noted that the content of the original works is never available to be viewed on the blockchain or exposed publicly. It is not visible to us, and it's not visible to any third party either.


So, what we provide is a time stamping facility which allows you to do three things:

    •    Legally establish yourself as the copyright owner of the work.

    •    Legally establish the date of creation.

    •    Take legal action against anyone who infringes on your copyrighted work.

It also eases the assignment and transfer of the copyrighted works to third-party entities and individuals.


  • We know that Digital Witnessor works together with legal company RDB Associates? What is the role of this company?

RDB Associates is a full-service multi-specialty law firm based out of Bangalore, with multiple offices across India. I am one of the founding partners of the firm, which started in 2017. We believe that in our country, as elsewhere, many people are not going to go and get their copyrights registered, although we see that people do that for their other intellectual property rights such as trademarks, industrial designs, patents, etc.

But with copyright, no one really gives that extra push to get their works registered. So, we noticed that there were many infringement matters wherein copyrights were in question and it was very hard for even the opposing counsel and for us to prove that such and such copyright existed at a particular time or not.

We did find a way to prove that the creations are in fact created on those particular timelines. It made the process a little more streamlined and a little more simple especially since it's not easy for everyone to approach the registrar for the Copyright and requires properly drafted applications. With the introduction of the platform Digital Witnessor, one can do it in a few seconds and get the process of registration started with ease.


We have a separate intellectual property team that works on registration and cases of infringement. We are integrated into the whole aspect through the onboarding of our clients onto this Platform or giving legal opinions on whether copyright exists or not, sending out legal notices in case of any infringements, and so on and so forth.


  • What is the distinctive feature of your company from others on the market?

Presently there aren’t many timestamping companies. We don't technically provide the same service as other competitors in the market dealing with similar platforms. However, one of the features that is distinctive is that we provide for easy assignment of copyrights from the copyright owner to third parties. So, that is a great feature that is available on our platform.

However, our other main USP is that our platform is going to be used across the world. Most of the companies that exist are very jurisdiction-specific, so they only apply to certain areas, thereby limiting the rights to those jurisdictions alone.


  • What are the benefits that a company would get by using the platform Digital Witnessor?

The primary benefits for a company are establishing its definite right in rem and streamlining registration with the applicable registrars/entities in its jurisdiction, making it much easier to register its work.

It will ease the process in a way that quicker decisions would be made regarding the infringement of copyrights. And individuals do not have to wait longer and go through a long, arduous litigation process to get justice. So we believe that in case of IP rights, it is important to establish definite rights and to not leave it open-ended whereby one invites liability. Streamlining the process is very important and that's the main benefit that the platform would be providing.


  • How do you see the company in 5 years?

We do have certain things lined up and planned for the next couple of years. For starters, there is the integration of the technology for agreements, which would make enforcing contracts and agreement terms much easier. Once this facility is provided, I think many companies would in fact like to use this platform just to streamline their internal processes as well.

But currently, I think we need to concentrate on copyright protection, and we shall take it one step at a time.


  • We've covered quite a bit in this conversation. Before we wrap up, is there anything else you'd like to share?

I think we covered most of the aspects of the platform and its benefits. I am just looking forward to seeing how this develops, grows, and integrates itself into the market in the coming few months.

Indian Copyright Office Asks for Executable File for Website Code?


The Indian copyright office grants the developer of a computer program a series of rights that legally protect the original creation. Under the Copyright Act, computer programming code can be registered as a ‘literary work’. As the program is safeguarded by copyright, each subsequent modification or addition to the code that contains sufficient originality will also be protected under the law. Generally, a computer program is protected not by just one copyright but by a set of copyrights, beginning with the first source code written and ending with the last addition by the creator.

Although source code and object code differ from each other, the copyright office views both forms as equal for registration purposes, maintaining the notion that the source code and object code are just two distinct forms of the same copyrighted program.

Copyright ownership refers to a collection of rights that gives the creator the exclusive right to use an original creation such as a song, literary work, movie, or software. It means that the original authors of works, and the people or companies they have authorized, are the only ones with the exclusive right to reproduce the creation.

Recently, a company director applied for copyright registration for his PHP and Python programs. To his surprise, however, the Indian copyright office asked for an executable file. PHP code used in websites is interpreted at run time and is not compiled into an executable file, so there was no possible way the director could have provided one for his PHP program. The question remains why officials at the Indian copyright office are not aware that there is no executable file for website code and, moreover, why they require one in the first place.

In India, the Copyright Act, 1957 grants protection to the Intellectual Property Rights (IPR) of computer software. As per the definition in the Indian Copyright Act, Computer programs are classified as ‘literary works’. Accordingly, the rights of computer software are protected under the provisions of the Act.

Can open source software be bought?


Open-source software (OSS) is released under a special license that makes its source code available for users to inspect, use, modify, and enhance. It is a common misunderstanding that such software is not copyrighted; in fact, it is copyrighted and released under a license that lets its users study, change, and use its source code or services (depending upon the software), including for commercial use. Well-known names in the open-source world include Linux, Red Hat, Ubuntu, GitHub, FreeBSD, and Fedora.


In the early 2000s, the tech world was quite critical and skeptical of open-source software, with then Microsoft CEO Steve Ballmer calling Linux a 'cancer' and open-source software 'a communist threat'. Since then, OSS has come a long way with the success of Red Hat and Linux. Open source has given a silver lining to underdog developers and defied the monopoly of tech giants, giving small businesses and individuals the power to grow using open-source code.

But what open-source devotees don't know, or don't stress, is that open-source software can be bought and acquired by commercial companies. The puzzle is how something open source can be bought at all, but open-source software still has copyrights, and those can be bought and the software changed to closed source. And open-source software is being acquired at lightning speed: IBM acquired Red Hat, and Microsoft is portraying itself as "the open-source leader" by joining the Open Invention Network (OIN) and acquiring GitHub.

There are advantages when big companies take over open-source software: many of these projects were not established with a business model and would eventually run out of resources, but if such companies buy them, they can stay afloat and keep serving their users. But there is also a dark side to these acquisitions, as they could mean the end of open source. Once the rights are sold, the software could be closed and the free service could come to an end. Those who already use the open-source version would not be affected, as it is already licensed, but any future version of the software could be closed.

Microsoft says that “Microsoft is all-in on open source, we have been on a journey with open source, and today we are active in the open-source ecosystem, we contribute to open-source projects, and some of our most vibrant developer tools and frameworks are open source.” The same goes for IBM and Red Hat. But these are big and popular software projects; over smaller, less widely distributed software, the dark cloud still hovers.

Microsoft Sues IP Address for Windows, Office Piracy

Microsoft has filed a lawsuit against an individual IP address that was reportedly attempting to activate a pirated version of Windows and Office. The IP address points to a Comcast office in New Jersey and is accused of trying to activate over 1,000 copies of the software.

It is unclear who the complaint is filed against as the lawsuit mentions “John Does 1-10” and the IP address (73.21.204.220).

The full complaint can be seen below.

“During the software activation process, Defendants contacted Microsoft activation servers in Washington over 2800 times from December 2014 to July 2017, and transmitted detailed information to those servers in order to activate the software,” Microsoft claims in the complaint.

Microsoft is suing for both copyright and trademark infringement and has asked the court to seize all copies of the unlicensed software.