Speech to Text (STT)

Ikromjonov Ma'rufjon

In this article I want to share what I have learned and used of the Android Speech API, however little that may be. The API helps with tasks such as converting spoken speech to text, automating work, authentication for security, data entry, adding subtitles and translations, robotics, and games. It also brings conveniences for people who have limited ability to write, or who cannot write well or quickly.

Speech to Text (generating text from spoken speech)

The Android Speech API supports several languages. Here I want to show how to generate text from spoken speech with the help of Google Assistant.

Step 1. Request the INTERNET permission.

<uses-permission android:name="android.permission.INTERNET" />

Step 2. Write the XML layout.

<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <TextView
        android:id="@+id/text"
        android:text="Speech to Text"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        app:layout_constraintTop_toTopOf="parent"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        android:textSize="20sp"
        android:padding="16dp"
        app:layout_constraintVertical_bias="0.2"
        android:gravity="center"/>

    <ImageView
        android:id="@+id/buttonMicrophone"
        android:src="@drawable/ic_recorder"
        android:layout_width="150dp"
        android:layout_height="150dp"
        app:layout_constraintTop_toTopOf="parent"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintVertical_bias="0.8"/>
</androidx.constraintlayout.widget.ConstraintLayout>

Step 3. The code. When the ImageView is tapped, we launch speech recognition through RecognizerIntent. The most important intent action is RecognizerIntent.ACTION_RECOGNIZE_SPEECH. It requires RecognizerIntent.EXTRA_LANGUAGE_MODEL, which tells the recognizer which speech model to prefer. Recognition uses the device's default language (for most people this is English), but if you want a different language you can set RecognizerIntent.EXTRA_LANGUAGE to a language tag such as "ru-RU".

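import android.app.Activity
import android.content.ActivityNotFoundException
import android.content.Intent
import android.os.Bundle
import android.speech.RecognizerIntent
import android.widget.ImageView
import android.widget.TextView
import android.widget.Toast
import androidx.appcompat.app.AppCompatActivity

class MainActivity : AppCompatActivity() {

    private val REQ_CODE_SPEECH_INPUT = 100

    // TextView that displays the recognized text
    private lateinit var text: TextView

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        text = findViewById(R.id.text)
        findViewById<ImageView>(R.id.buttonMicrophone).setOnClickListener {
            promptSpeechInput()
        }
    }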

    fun promptSpeechInput() {
        val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
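        // Required: tells the recognizer which speech model to prefer
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        // Optional: recognition language as a BCP 47 tag (Russian here)
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "ru-RU")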

        intent.putExtra(RecognizerIntent.EXTRA_PROMPT, getString(R.string.speech_prompt))

        try {
            startActivityForResult(intent, REQ_CODE_SPEECH_INPUT)
        } catch (e: ActivityNotFoundException) {
            Toast.makeText(
                this, "Sorry! Your device doesn't support speech input",
                Toast.LENGTH_SHORT
            ).show()
        }

    }

    override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
        super.onActivityResult(requestCode, resultCode, data)

        if (requestCode == REQ_CODE_SPEECH_INPUT) {
            if (resultCode == Activity.RESULT_OK && data != null) {
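                // EXTRA_RESULTS may be null or empty; candidates are generally ordered best first
                val message = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
                message?.firstOrNull()?.let { text.text = it }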
            }
        }
    }

    fun updateResults(s: String) {
        Toast.makeText(this, s, Toast.LENGTH_LONG).show()
        text.text = s
    }
}


The result arrives in onActivityResult(). There you check that requestCode matches the one you sent, that resultCode is RESULT_OK, and that data is not null.
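
The checks on the result can also be factored into a small helper. The following is only a minimal sketch, written in plain Java so that it runs outside Android. The class RecognitionResults and the method bestMatch are hypothetical names, not part of the Android SDK. It assumes the input has the shape of what EXTRA_RESULTS delivers: a list that may be null or empty, with the best candidate first.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Android SDK.
final class RecognitionResults {
    private RecognitionResults() {}

    // Returns the first non-blank candidate, or the fallback
    // when the list is null or contains no usable candidate.
    static String bestMatch(List<String> candidates, String fallback) {
        if (candidates == null) {
            return fallback;
        }
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("hello world", "hollow world");
        System.out.println(bestMatch(candidates, "(nothing recognized)"));
        System.out.println(bestMatch(null, "(nothing recognized)"));
    }
}
```

In the activity, the list returned by getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS) could be passed to such a helper before the TextView is updated.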

Figure 1. Generating text from speech.
Figure 2. Using the generated text.

Now, if we want to get the full list of supported languages, we write our own BroadcastReceiver.

import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.text.TextUtils;

import java.util.ArrayList;
import java.util.List;

public class LanguageDetailsReceiver extends BroadcastReceiver {
    List<String> mLanguages;
    MainActivity mSSL;

    public LanguageDetailsReceiver(MainActivity ssl) {
        mSSL = ssl;
        mLanguages = new ArrayList<String>();
    }

    @Override
    public void onReceive(Context context, Intent intent) {
        // The recognizer puts its answer into the result extras of the ordered broadcast
        Bundle extras = getResultExtras(true);
        mLanguages = extras.getStringArrayList(RecognizerIntent.EXTRA_SUPPORTED_LANGUAGES);
        if (mLanguages == null) {
            mSSL.updateResults("No voice data found.");
        } else {
            // Join the language tags with commas and append the total count
            String s = "\nSupported languages:\n"
                    + TextUtils.join(", ", mLanguages)
                    + "\nTotal: " + mLanguages.size();
            mSSL.updateResults(s);
        }
    }
}

Then we change MainActivity as follows.

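class MainActivity : AppCompatActivity() {

    // TextView that displays the list of languages
    private lateinit var text: TextView

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        text = findViewById(R.id.text)

        val broadcast = LanguageDetailsReceiver(this)
        // getVoiceDetailsIntent() may return null when no language details are available
        val detailsIntent = RecognizerIntent.getVoiceDetailsIntent(this)
        if (detailsIntent == null) {
            updateResults("No voice data found.")
            return
        }
        sendOrderedBroadcast(
            detailsIntent, null, broadcast, null,
            Activity.RESULT_OK, null, null
        )
    }

    fun updateResults(s: String) {
        text.text = s
    }
}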

List of supported languages.
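
Once the supported language tags are known, they can be used to validate the value passed as RecognizerIntent.EXTRA_LANGUAGE. The sketch below is one possible way to do that, in plain Java so it can be run outside Android. LanguageSupport and pickSupportedTag are hypothetical names, and the matching rule (exact tag first, then any tag with the same language code) is an assumption made for illustration, not something the Speech API prescribes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the Android SDK.
final class LanguageSupport {
    private LanguageSupport() {}

    // Returns the supported tag to use for the wanted one, or null if none matches.
    // An exact match (ignoring case) wins; otherwise the first tag that shares
    // the language code is returned, e.g. "en-AU" falls back to "en-US".
    static String pickSupportedTag(List<String> supportedTags, String wantedTag) {
        Locale wanted = Locale.forLanguageTag(wantedTag);
        String sameLanguage = null;
        for (String tag : supportedTags) {
            if (tag.equalsIgnoreCase(wantedTag)) {
                return tag;
            }
            Locale candidate = Locale.forLanguageTag(tag);
            if (sameLanguage == null
                    && !wanted.getLanguage().isEmpty()
                    && candidate.getLanguage().equals(wanted.getLanguage())) {
                sameLanguage = tag;
            }
        }
        return sameLanguage;
    }

    public static void main(String[] args) {
        List<String> supported = Arrays.asList("en-US", "en-GB", "ru-RU");
        System.out.println(pickSupportedTag(supported, "ru-RU"));
        System.out.println(pickSupportedTag(supported, "en-AU"));
        System.out.println(pickSupportedTag(supported, "uz-UZ"));
    }
}
```

A null result means the wanted language is not in the list, in which case it is probably safer to leave EXTRA_LANGUAGE unset and let the recognizer use the device default.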

We also need to know whether the device supports STT at all. For this, let's write a MediaUtil class; its getMicrophoneAvailable() function answers whether a microphone is available on the device.

class MediaUtil {
    companion object {
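        // Returns whether the microphone can be used: prepares a MediaRecorder
        // on the MIC source and reports false if prepare() throws an IOException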
        fun getMicrophoneAvailable(context: Context): Boolean {
            val recorder = MediaRecorder()
            recorder.setAudioSource(MediaRecorder.AudioSource.MIC)
            recorder.setOutputFormat(MediaRecorder.OutputFormat.DEFAULT)
            recorder.setAudioEncoder(MediaRecorder.AudioEncoder.DEFAULT)
            recorder.setOutputFile( File(context.cacheDir,"MediaUtil#micAvailTestFile").absolutePath)
            var available = true
            try {
                recorder.prepare()
            } catch (exception: IOException) {
                available = false
            }
            recorder.release()
            return available
        }

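        // Returns whether speech to text (speech recognition) is available, i.e. whether
        // any activity handles ACTION_RECOGNIZE_SPEECH (the name says TTS, but the check is for STT).
        // Note: on Android 11+ this query may need a matching <queries> entry in the manifest.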
        fun getTTSAvailable(context: Context): Boolean {
            val packageManager: PackageManager = context.packageManager
            val speechIntent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
            val speechActivities =
                packageManager.queryIntentActivities(speechIntent, 0)
            return speechActivities.size != 0
        }
    }
}
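
The two checks are usually consulted together before the microphone button is enabled. As a small illustration, the sketch below combines two boolean results into a single status. It is plain Java, and SpeechSupport, Status and check are hypothetical names introduced here; the two parameters stand in for the values returned by getMicrophoneAvailable() and getTTSAvailable().

```java
// Hypothetical helper, not part of the Android SDK.
final class SpeechSupport {
    private SpeechSupport() {}

    enum Status { READY, NO_MICROPHONE, NO_RECOGNIZER }

    // The microphone is checked first: without it a recognizer is of no use.
    static Status check(boolean microphoneAvailable, boolean recognizerAvailable) {
        if (!microphoneAvailable) {
            return Status.NO_MICROPHONE;
        }
        if (!recognizerAvailable) {
            return Status.NO_RECOGNIZER;
        }
        return Status.READY;
    }

    public static void main(String[] args) {
        System.out.println(check(true, true));
        System.out.println(check(true, false));
    }
}
```

With a status like this, the activity could show a specific message for each failure instead of a generic "not supported" toast.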


That brings us to the end of the article. There are many more good articles on STT, and if the topic interests you I would recommend reading them. I apologize for any mistakes in the article. I would be glad if you reported them in the group; by doing so you would help the article reach the next reader free of errors. The full code above is available at this link.

Stay close to our channel and keep following us.


