In a move aimed directly at the heart of the contentious debate surrounding artificial intelligence and creative ownership, a cross-departmental research initiative at Blackbridge Institute Polytechnic and Arts has released a comprehensive ethical framework and an accompanying open-source software toolkit. The ‘Provenance Project’, as it is formally known, is designed to provide creators, developers, and organisations with a practical methodology for training and auditing generative AI models in a manner that respects the intellectual property and ethical rights of original artists and authors.
This initiative emerges from a growing climate of legal and ethical uncertainty. As generative AI models have become more powerful, their reliance on vast, often indiscriminately scraped datasets has led to significant concerns about copyright infringement and the uncredited use of creative works. The Provenance Project seeks to move the conversation beyond theoretical debate by offering a tangible set of tools to foster a more transparent and equitable AI ecosystem.
The project is structured around two core components. The first is the ‘Ethical Provenance Framework’ (EPF), a detailed taxonomy developed over two years by a working group drawn from the Digital Media & Communication Design and Global Political Economy & Governance disciplines. The framework provides a granular classification system for training data, categorising sources by their licensing status, creator permissions, and cultural context. It offers guidelines for establishing a clear chain of custody for every piece of data used to train a model, a concept historically central to the art world but largely absent from AI development.
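The EPF’s full taxonomy is not reproduced here, but the kind of per-item record it describes, covering licensing status, creator permissions, cultural context, and a chain of custody, can be sketched in code. The field names and categories below are illustrative assumptions rather than the framework’s actual vocabulary.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class LicenceStatus(Enum):
    PUBLIC_DOMAIN = "public_domain"
    OPEN_LICENCE = "open_licence"        # e.g. CC0 or CC-BY
    LICENSED = "licensed"                # explicit agreement with the rights holder
    UNCLEARED = "uncleared"              # provenance unknown; excluded by default


@dataclass
class ProvenanceRecord:
    """Illustrative per-item record in the spirit of the EPF taxonomy (not its actual schema)."""
    source_uri: str                      # where the item was obtained
    creator: str                         # attributed author or artist
    licence: LicenceStatus
    creator_permission: bool             # explicit opt-in for AI training
    cultural_context: str                # free-text note on the work's community or tradition of origin
    custody_chain: List[str] = field(default_factory=list)  # ordered record of custodians and transfers


# A record that would pass a strict "cleared for training" policy:
example = ProvenanceRecord(
    source_uri="https://example.org/artwork/1234",
    creator="Jane Doe",
    licence=LicenceStatus.OPEN_LICENCE,
    creator_permission=True,
    cultural_context="contemporary digital illustration",
    custody_chain=["creator upload", "institutional archive ingest"],
)
```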
A Senior Lecturer in Digital Media & Communication Design, who co-chaired the framework’s development, elaborates on the philosophy: “We felt an urgent need to reintroduce the concept of ‘provenance’ into the digital creative process. For centuries, an artwork’s value and authenticity have been tied to its history of ownership. In the digital age, this has been dangerously eroded. The EPF is our attempt to build a robust ethical scaffolding for AI, ensuring that ‘inspiration’ does not become a euphemism for appropriation. It’s not about stifling innovation; it’s about ensuring that innovation doesn’t come at the cost of the very creative communities it claims to serve.”
The second, and perhaps most critical, component is the practical implementation of this framework: a software toolkit named ‘TraceWeaver’. Developed by a team of postgraduate and undergraduate students from Computational Engineering & Intelligent Systems, TraceWeaver is a suite of open-source tools that allows developers to integrate the EPF directly into their machine learning pipelines. The toolkit includes modules for automatically scanning and classifying training data according to the framework’s criteria.
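The article does not describe TraceWeaver’s interfaces, so the following is a minimal sketch of what a scanning-and-classification step in a training pipeline could look like under the framework’s criteria. The sidecar-metadata convention, function names, and decision rule are assumptions for illustration, not the toolkit’s actual API.

```python
import json
from pathlib import Path

# Hypothetical policy: only items whose sidecar metadata declares an accepted licence
# and an explicit creator opt-in are admitted to the training set.
ALLOWED_LICENCES = {"public_domain", "cc0", "cc-by", "licensed"}


def classify_item(metadata: dict) -> str:
    """Return a coarse EPF-style category for one training item."""
    licence = str(metadata.get("licence", "")).lower()
    opted_in = bool(metadata.get("creator_permission", False))
    if licence in ALLOWED_LICENCES and opted_in:
        return "cleared"
    if licence in ALLOWED_LICENCES:
        return "needs_permission"
    return "excluded"


def scan_dataset(root: Path) -> dict:
    """Walk a dataset directory and tally items by category.

    Assumes each data file has a JSON sidecar (<name>.json) carrying its licence
    and permission fields -- an illustrative convention, not TraceWeaver's.
    """
    counts = {"cleared": 0, "needs_permission": 0, "excluded": 0}
    for sidecar in root.rglob("*.json"):
        metadata = json.loads(sidecar.read_text())
        counts[classify_item(metadata)] += 1
    return counts


if __name__ == "__main__":
    demo = {"licence": "cc-by", "creator_permission": True}
    print(classify_item(demo))  # -> "cleared"
```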
Its most innovative feature is a technique for embedding persistent, non-intrusive metadata within AI-generated outputs. This digital watermark, which uses a form of steganography, verifiably links a generated image, text, or piece of music back to the specific training data sources and the version of the AI model that created it. This provides an audit trail that has been critically missing from the current generation of AI tools.
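The payload format of such a watermark is not specified in the article; one hypothetical way to bind an output to a model version and its source set is to embed a compact, hashed record along the following lines.

```python
import hashlib
import json


def build_provenance_payload(model_version: str, source_ids: list) -> bytes:
    """Serialise a compact provenance record for embedding.

    Hashing the sorted source identifiers keeps the payload small while still
    letting an auditor verify a claimed source list against the embedded digest.
    (Illustrative format, not the actual TraceWeaver schema.)
    """
    sources_digest = hashlib.sha256("\n".join(sorted(source_ids)).encode()).hexdigest()
    record = {"model": model_version, "sources_sha256": sources_digest}
    return json.dumps(record, separators=(",", ":")).encode()


payload = build_provenance_payload("traceweaver-demo-0.1", ["img-0001", "img-0042", "txt-0007"])
print(len(payload), "bytes to embed")
```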
A PhD candidate who led the development of the metadata embedding module spoke to the technical challenges. “The primary difficulty was ensuring the watermark was robust enough to survive compression and minor edits, yet subtle enough that it didn’t degrade the quality of the output. We experimented with several approaches before settling on a method that encodes the provenance data in the frequency domain of an image, for instance. It was a painstaking process of trial and error. The toolkit isn’t a perfect, unbreakable lock, but it is a significant and necessary step towards accountability.”
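TraceWeaver’s actual algorithm is not published in the article, but frequency-domain image watermarking of the kind described is commonly illustrated by nudging mid-frequency DCT coefficients of small image blocks so that their quantised parity carries the payload bits. The sketch below, using NumPy and SciPy, shows that general idea; the block size, coefficient choice, and strength parameter are arbitrary assumptions.

```python
import numpy as np
from scipy.fft import dctn, idctn


def embed_bits(gray: np.ndarray, bits: list, strength: float = 6.0) -> np.ndarray:
    """Embed one bit per 8x8 block by quantising a mid-frequency DCT coefficient.

    `gray` is a 2-D float array (a grayscale image). This is a toy illustration
    of frequency-domain watermarking, not TraceWeaver's method.
    """
    out = gray.astype(float).copy()
    h, w = out.shape
    blocks = [(r, c) for r in range(0, h - 7, 8) for c in range(0, w - 7, 8)]
    for (r, c), bit in zip(blocks, bits):
        coeffs = dctn(out[r:r + 8, c:c + 8], norm="ortho")
        # Force the parity of the quantised coefficient to match the bit.
        q = np.round(coeffs[3, 4] / strength)
        if int(q) % 2 != bit:
            q += 1
        coeffs[3, 4] = q * strength
        out[r:r + 8, c:c + 8] = idctn(coeffs, norm="ortho")
    return out


def extract_bits(gray: np.ndarray, n_bits: int, strength: float = 6.0) -> list:
    """Recover the embedded bits from the same mid-frequency coefficient."""
    h, w = gray.shape
    blocks = [(r, c) for r in range(0, h - 7, 8) for c in range(0, w - 7, 8)]
    bits = []
    for r, c in blocks[:n_bits]:
        coeffs = dctn(gray[r:r + 8, c:c + 8].astype(float), norm="ortho")
        bits.append(int(np.round(coeffs[3, 4] / strength)) % 2)
    return bits


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64)).astype(float)
message = [1, 0, 1, 1, 0, 0, 1, 0]
stamped = embed_bits(image, message)
assert extract_bits(stamped, len(message)) == message
```

A scheme of this shape tends to survive mild lossy compression because the information sits in coefficients that compression largely preserves, though, as the researchers themselves note, it is not an unbreakable lock.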
The project’s scope extends into the realm of business and commerce. A faculty member from the Strategic Business & Entrepreneurship discipline, who consulted on the project, highlights the potential for new economic models. “The current model for many large AI labs is opaque by design. By enabling verifiable provenance, the TraceWeaver toolkit could empower new business models where artists can willingly license their work for AI training and receive micropayments or royalties when their stylistic DNA contributes to a new, commercially successful creation. It offers a pathway from an extractive model to a collaborative one, creating a more sustainable and fair market for both human and machine-generated creativity.”
By releasing the entire project—the framework documentation and the TraceWeaver toolkit—under an open-source license, Blackbridge Institute is explicitly inviting collaboration and critique. The team acknowledges that their framework is a starting point and that its ultimate success will depend on broad adoption and refinement by the wider technology and creative communities. They have created a public repository for the code and have begun dialogues with several technology firms and artists’ rights organisations across Europe. This initiative represents a deeply held belief at the Institute: that academic work must not remain confined to theory but must actively engage with the world, offering constructive, if sometimes imperfect, solutions to its most complex problems.